Abstract



Models for noncoherent error control in random linear network coding (RLNC) and store and 
forward (SAF) have been recently proposed. In this paper, we model different types of random network 
communications as the transmission of flats of matroids. This novel framework encompasses RLNC and 
SAF and allows us to introduce a novel protocol, referred to as random affine network coding (RANC), 
based on affine combinations of packets. Although the models previously proposed for RLNC and SAF 
only consider error control, using our framework, we first evaluate and compare the performance of 
different network protocols in the error-free case. We define and determine the rate, average delay, and 
throughput of such protocols, and we also investigate the possibilities of partial decoding before the 
entire message is received. We thus show that RANC outperforms RLNC in terms of data rate and 
throughput thanks to a more efficient encoding of messages into packets. Second, we model the possible 
alterations of a message by the network as an operator channel, which generalizes the channels proposed 
for RLNC and SAF. Error control is thus reduced to a coding-theoretic problem on flats of a matroid, 
where two distinct metrics can be used for error correction. We study the maximum cardinality of codes 
on flats in general, and codes for error correction in RANC in particular. We finally design a class of 
nearly optimal codes for RANC based on rank metric codes for which we propose a low-complexity 
decoding algorithm. The gain of RANC over RLNC is thus preserved with no additional cost in terms 
of complexity. 

I. Introduction 

During transmission through a network, data can be modified, as in network coding, without
affecting their decoding. However, other modifications, such as packet errors or losses, corrupt
the transmitted message. Recently, operator channels have been proposed to differentiate these two
types of modifications for data transmission using random linear network coding (RLNC) [1] and store
and forward (SAF) [2], respectively. For RLNC, it is shown that data transmission is equivalent to the
communication of a linear subspace of a given vector space [1]; for SAF, however, a subset of a set is
transmitted [2]. Using these operator channels, noncoherent error correction in RLNC and SAF can be
reduced to coding theoretic problems on linear subspaces and subsets, respectively.

In this paper, we generalize the models described above by viewing random data transmission through 
a network as the communication of a flat of a matroid. Matroids [3] can be viewed as the combinatorial
essence of independence, and hence are a generalization of linear independence; flats of a matroid can
be viewed as generalizations of linear subspaces. Studying matroids allows us to focus on the combinatorial
aspects of independence and combinations, without assuming any underlying algebraic structure. Although
the models for RLNC and SAF were introduced for error control, our matroid framework allows us to
study protocols for both the error-free case and the case where error control is considered. The matroids
associated to RLNC and SAF are easily determined and are well-known. In particular, we shall show
that the matroid for RLNC, the projective geometry, only considers a fraction of packets (around $q^{-1}$,
where packets are viewed as vectors over GF(q)), hence leading to a rate loss of around one symbol per
packet.

In order to thwart this rate loss, we introduce a new way to combine packets for network coding, 
referred to as random affine network coding (RANC), where packets are viewed as points instead of 
vectors and new packets are created via affine combinations. The associated matroid is the well-known 
and thoroughly studied affine geometry, whose flats are affine subspaces of an affine space. Unlike RLNC, 
which only considers a fraction of all packets, RANC works on all possible packets, thus utilizing a better 
encoding of messages into flats. Moreover, since affine combinations are particular linear combinations, 
the complexity at the intermediate nodes is not increased. At the receiver end, the message can be decoded 
using Gaussian elimination, for an affine subspace is no more than a translated linear subspace. Therefore, 
utilizing RANC instead of RLNC does not increase the complexity at the source, the intermediate nodes, 
or the destinations. 

Then, using our matroid framework, we determine, evaluate, and compare the performances of different 
network protocols. We first define the data rate of a matroid as the ratio between the amount of information 
carried by the flat, i.e. the logarithm of the number of flats, and the size of the message transmitted through 
the network. We also investigate the average delay of a matroid, which reduces to the coupon collector 
problem for SAF. Combining these two parameters, we also define the throughput of a matroid as the 
proportion of useful information received by the destination. We show that RANC outperforms RLNC in
terms of data rate, while offering a similar average delay, thus yielding a higher throughput of around one 
symbol per packet. We then study the delay in more detail via the average number of independent packets 
in a given number of received packets. We hence demonstrate that this number tends to the optimum 
for RLNC and RANC when the field size increases, while the number of independent packets in SAF 
follows an exponential recovery. We finally investigate the possibilities of partial decoding. We prove 
that partial decoding is highly unlikely in RLNC and RANC, while for SAF all packets are decodable. 
Therefore, RLNC and RANC follow a zero-one pattern: no packets can be decoded before receiving 
the total number of packets in a message, and once this amount is received, the whole message can be 
decoded. On the other hand, SAF follows an exponential recovery in terms of partially decodable packets. 

The study described above considers an error-free transmission through the network. In the presence 
of errors, message alterations (packets lost, injected, or in error) correspond to modifications of the 
transmitted flat. The network can hence be viewed as an operator channel, which generalizes the channels 
defined in [1] and [2] for RLNC and SAF, respectively. We then introduce two metrics for error correction
in random network communications. These metrics, referred to as the lattice distance and the modified
lattice distance, respectively, are identified with previously proposed metrics for RLNC [4] and SAF [2].
We also place constant-dimension codes used for RLNC [1] and constant-weight codes used for SAF
[2] into the new framework of matroid codes, which are codes on flats of a matroid sharing the same
rank. We then investigate error control for RANC with codes on affine subspaces. We derive bounds on
the maximum cardinality of such codes and determine a nearly optimal class of codes based on liftings
of rank metric codes [5]. We finally design a decoding algorithm for these codes based on, and with
the same order of complexity as, the decoder proposed in [1] for RLNC. The rate gain of RANC over
RLNC is therefore preserved when error correction is considered, and at no additional cost in terms of 
complexity. 

We summarize the advantages of our matroid framework below. 

• First, this framework is very general, and offers a unified approach for distinct problems such as
SAF, RLNC, and RANC. It allows us to focus on the combinatorial properties of network protocols, in
terms of both combinations and encoding. Also, associating a matroid to a protocol provides a
new tool to study and compare the performances of different protocols for both the error-free case
and when error control is enforced.

• Second, different properties of a protocol arising from matroid theory can be discovered. For example, 
we demonstrate how RLNC can be viewed as a matroid on only a fraction of all possible packets, and 
hence determine the data rate and the actual number of possible combinations offered by RLNC. The
lattice distance also illustrates the easiest way to alter a message, hence highlighting the sensitivity 
of network coding to errors. 

• Third, when studying error control, the advantages of using an operator channel still apply to our 
general framework. Although the matroid depends on the protocol, it is independent of the actual 
network, rendering our approach noncoherent and robust to network topology changes. Moreover, 
errors on the message level, such as packets lost or injected, and errors on the packet level (bits or 
symbol errors) can be detected and corrected using the same class of codes. The problem of error 
control can be eventually tackled using methods from algebraic coding, such as binary constant- 
weight codes or rank metric codes. 

• Fourth, our model offers a wealth of alternatives to the protocols already proposed in the literature, 
as many different types of matroids have been previously discovered and studied. One of these 
alternatives introduced here, RANC, is shown to outperform RLNC. Different matroids may lead to 
different tradeoffs between the number of possible combinations and the data rate. Also, it is known
that linear network coding is not optimal when multiple sources are considered [6]; hence matroids that
are not linearly representable may offer a higher throughput than RLNC or RANC in these cases.

The rest of the paper is organized as follows. Section II reviews some necessary background on
matroids and error correction models for RLNC and SAF. In Section III, we introduce the model based
on matroids for error-free communications. Section IV introduces and illustrates random affine network
coding. In Section V, we evaluate and compare the different performance parameters of matroids. In
Section VI, we model the alterations of the message as an operator channel, and study the codes used
for error correction in random network communications. Finally, Section VII details possible extensions
of our work.

II. Preliminaries 

A. Matroids 

We review below the definition and major properties of matroids and their flats. Although the concepts
introduced below arise from matroid theory, they are all generalizations of well-known concepts in linear
algebra. For an extensive account of matroid theory, the interested reader is referred to [3].

For any set $E$, we denote the set of subsets of $E$ with cardinality $0 \leq i \leq |E|$ as $\mathcal{P}(E, i)$
and its power set as $\mathcal{P}(E) = \bigcup_{i=0}^{|E|} \mathcal{P}(E, i)$. A matroid is a pair
$\mathcal{M} = (E, \mathcal{I})$, where $E$ and $\mathcal{I} \subseteq \mathcal{P}(E)$ are referred to
as the ground set and the independent sets of $\mathcal{M}$, respectively. The independent sets are generalizations
of linearly independent vectors and satisfy the following three axioms: $\emptyset \in \mathcal{I}$; if
$A \in \mathcal{I}$ and $B \subseteq A$, then $B \in \mathcal{I}$; if $I_1, I_2 \in \mathcal{I}$ with
$|I_1| > |I_2|$, then there exists $e \in I_1 \setminus I_2$ such that $I_2 \cup \{e\} \in \mathcal{I}$. The third
axiom, referred to as the independence augmentation axiom, is crucial as it guarantees that any family
of independent elements can be extended to form a basis (a maximal family of independent elements).
Clearly, all bases have the same cardinality.

To any matroid is associated a rank function $\mathrm{rk}(A)$ for all $A \subseteq E$, defined as the maximum number
of independent elements in $A$. For any two subsets $A, B \subseteq E$, we have the submodular inequality
$\mathrm{rk}(A \cup B) + \mathrm{rk}(A \cap B) \leq \mathrm{rk}(A) + \mathrm{rk}(B)$. The rank of a matroid is simply
the rank of its ground set, and is the number of elements in any basis. The closure $\mathrm{cl}(A)$ of a subset $A$
of the ground set is then defined as the maximal subset $B \subseteq E$ such that $B$ contains $A$ and
$\mathrm{rk}(B) = \mathrm{rk}(A)$. The closure is unique, as it can be shown that
$\mathrm{cl}(A) = \{e \in E : \mathrm{rk}(A \cup e) = \mathrm{rk}(A)\}$ [3, Eq. 1.4.1]. The closure and the rank are
generalizations of the span of a set of vectors and the dimension of that span, respectively.
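As an illustration of the rank and closure just reviewed, the following is a minimal sketch for a binary vector matroid. It assumes the ground set is the set of non-zero vectors of GF(2)^3, each stored as an integer bitmask; the helper names `gf2_rank` and `closure` are hypothetical and chosen only for this example.

```python
def gf2_rank(vectors):
    """Rank over GF(2) of vectors given as integer bitmasks."""
    basis = {}  # leading-bit position -> reduced vector with that leading bit
    for v in vectors:
        while v:
            lead = v.bit_length() - 1
            if lead not in basis:
                basis[lead] = v  # v is independent of the vectors seen so far
                break
            v ^= basis[lead]     # cancel the leading bit and keep reducing
    return len(basis)


def closure(subset, ground):
    """cl(A) = {e in E : rk(A + e) = rk(A)}, evaluated by brute force."""
    subset = list(subset)
    r = gf2_rank(subset)
    return {e for e in ground if gf2_rank(subset + [e]) == r}


# Ground set (assumption for this example): non-zero vectors of GF(2)^3.
ground = list(range(1, 8))
flat = closure([0b001, 0b010], ground)
```

Here `flat` contains the two chosen vectors together with their sum, which is the span of the pair restricted to the ground set; applying `closure` to `flat` again returns the same set, as expected of a flat.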

A flat is a set equal to its closure, which is a generalization of a linear subspace. In particular, we refer
to any flat of rank $r - 1$ in a matroid of rank $r$ as a hyperplane. By extension, we refer to any family of
$k$ independent elements in a flat of rank $k$ as a basis of that flat. The set of flats of a matroid, denoted
as $\mathcal{F}$, is closed under intersection. We also denote the set of flats of rank $k$ as $\mathcal{F}_k$
and its cardinality as $N_k$ for any $0 \leq k \leq r$. Furthermore, the set of flats ordered by inclusion forms
a lattice (in the partially ordered set sense), where the meet of two flats is their intersection, and their join
is the closure of their union.

A matroid may contain loops and parallel elements. A loop $l$ is an element of the ground set belonging
to no independent set: $\{l\} \notin \mathcal{I}$; alternatively, $l$ belongs to the closure of the empty set. A
collection of elements are said to be parallel if they are pairwise dependent: $\{e_i, e_j\} \notin \mathcal{I}$
for $i \neq j$; they hence all belong to a set with rank 1. A loop and parallel elements are generalizations of the
all-zero vector and collinear vectors, respectively. A matroid is said to be simple if it does not contain any loops
or parallel elements. For any matroid $\mathcal{M}$, the simple matroid obtained by removing all loops and keeping
only one element in each set of parallel elements of $\mathcal{M}$ has the same lattice of flats as $\mathcal{M}$.
For any simple matroid, we have $\mathcal{F}_0 = \{\emptyset\}$, $N_0 = 1$ and $\mathcal{F}_1 = \mathcal{P}(E, 1)$,
$N_1 = |E|$.

We now review three important classes of matroids. First, the free matroid on $r$ elements, classically
denoted as $U_{r,r}$, has $[r] = \{0, 1, \ldots, r - 1\}$ as a ground set, and any subset of $[r]$ is independent.
Clearly, this matroid is simple, has rank $r$, and any subset of $[r]$ is a flat.

Second, the projective geometry $\mathrm{PG}(r - 1, q)$ has all the non-zero vectors of $\mathrm{GF}(q)^r$ with leading
nonzero coefficient equal to 1 as ground set, where linear independence is used. This matroid is simple,
has rank $r$ and its flats are in one-to-one correspondence with the linear subspaces of $\mathrm{GF}(q)^r$. Therefore,
the number of flats of rank $k$ is given by the Gaussian binomial
$\begin{bmatrix} r \\ k \end{bmatrix} = \prod_{i=0}^{k-1} \frac{q^{r-i} - 1}{q^{k-i} - 1}$, which satisfies
$q^{k(r-k)} \leq \begin{bmatrix} r \\ k \end{bmatrix} < K_q^{-1} q^{k(r-k)}$ for all $0 \leq k \leq r$, where
$K_q = \prod_{j=1}^{\infty} (1 - q^{-j}) < 1$ tends to 1 when $q$ tends to infinity [7].
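The Gaussian binomial and its bounds can be checked numerically. The sketch below is only an illustration: the function names are hypothetical, and $K_q$ is approximated by truncating the infinite product after a fixed number of factors, which is an assumption of this example rather than part of the result in [7].

```python
def gaussian_binomial(r: int, k: int, q: int) -> int:
    """Number of k-dimensional linear subspaces of GF(q)^r."""
    if k < 0 or k > r:
        return 0
    num, den = 1, 1
    for i in range(k):
        num *= q ** (r - i) - 1
        den *= q ** (k - i) - 1
    return num // den  # the quotient is always an exact integer


def k_q(q: int, terms: int = 64) -> float:
    """Truncated product approximating K_q = prod_{j >= 1} (1 - q^{-j})."""
    prod = 1.0
    for j in range(1, terms + 1):
        prod *= 1.0 - float(q) ** (-j)
    return prod


def bounds_hold(r: int, k: int, q: int) -> bool:
    """Check q^{k(r-k)} <= [r, k] < q^{k(r-k)} / K_q for one triple (r, k, q)."""
    core = q ** (k * (r - k))
    return core <= gaussian_binomial(r, k, q) < core / k_q(q)
```

For instance, `gaussian_binomial(4, 2, 2)` counts the two-dimensional subspaces of GF(2)^4, and `bounds_hold(4, 2, 2)` compares that count with $2^4$ and $2^4 / K_2$.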

Third, removing a hyperplane from $\mathrm{PG}(r - 1, q)$ yields the affine geometry $\mathrm{AG}(r - 1, q)$. This matroid
is also simple with rank $r$ and its flats are the affine subspaces of $\mathrm{GF}(q)^{r-1}$; there are
$q^{r-k} \begin{bmatrix} r-1 \\ k-1 \end{bmatrix}$ flats of rank $k$ for all $1 \leq k \leq r$. Any affine subspace
with rank $k$ can be represented by a linear subspace with dimension $k - 1$ translated by a point belonging to a
complementary linear subspace. By definition, $\mathrm{AG}(r - 1, q)$ is a submatroid of $\mathrm{PG}(r - 1, q)$, and
can be viewed as a matroid on the points in $\mathrm{GF}(q)^{r-1}$, where two points $\mathbf{u}, \mathbf{v}$ are
affinely independent if and only if the vectors $(1, \mathbf{u}), (1, \mathbf{v}) \in \mathrm{GF}(q)^r$ are
linearly independent.



B. Error control for RLNC and SAF 

We now review the existing models for error correction in RLNC and SAF given in [1] and [2],
respectively. For RLNC, several techniques have been proposed for error correction (see [8], [9] for
coherent error correction); however, we are interested here in the operator channel approach introduced
in [1] for noncoherent error control. Suppose a message, encoded into $k$ linearly independent packets
in $\mathrm{GF}(q)^n$, is transmitted through a network using RLNC. Since the linear combinations operated by
the intermediate nodes do not modify the subspace spanned by the packets, RLNC is viewed as the
transmission of a linear subspace of dimension $k$ of $\mathrm{GF}(q)^n$. The alterations of the message (packets
lost, injected, or in error) hence correspond to modifications of that subspace. The transmission of a
message using RLNC is hence modeled as an operator channel which modifies the input subspace sent
by the source into the output subspace received by the destination. Accordingly, codes on subspaces, and
more specifically codes on a Grassmannian referred to as constant-dimension codes, have been proposed for
error correction in RLNC. Two metrics between subspaces have been proposed: the subspace metric and
the injection metric [4]. The maximum cardinality of a constant-dimension code, consisting of subspaces
of $\mathrm{GF}(q)^n$ with dimension $k$, with minimum injection distance $d$ (and equivalently, minimum subspace
distance $2d$) is between $q^{\min\{k(n-k-d+1),\,(n-k)(k-d+1)\}}$ and
$K_q^{-1} q^{\min\{k(n-k-d+1),\,(n-k)(k-d+1)\}}$. These bounds follow from the Singleton bound in [1] and the
inequalities on the Gaussian binomial above, and were tightened in [10]-[12].

A possible and highly practical construction of constant-dimension codes, referred to as liftings of
rank metric codes, has been proposed in [13]. Rank metric codes [5], [14], [15] are codes on matrices
in $\mathrm{GF}(q)^{k \times \nu}$, where the rank distance between two matrices is simply the rank of their difference.
The number of matrices with rank $r$ in $\mathrm{GF}(q)^{k \times \nu}$ is given by
$\begin{bmatrix} \nu \\ r \end{bmatrix} \prod_{i=0}^{r-1} (q^k - q^i)$. The maximum cardinality
of a rank metric code in $\mathrm{GF}(q)^{k \times \nu}$ with minimum rank distance $d$ is given by
$q^{\min\{k(\nu-d+1),\,\nu(k-d+1)\}}$ and is achieved by Gabidulin codes [5], an analogue of Reed-Solomon codes.
For any $\mathbf{M} \in \mathrm{GF}(q)^{k \times \nu}$, the linear lifting $I_{\mathcal{L}}(\mathbf{M})$ of $\mathbf{M}$
is the row space of the matrix $(\mathbf{I}_k | \mathbf{M})$, a subspace of $\mathrm{GF}(q)^{k+\nu}$ with
dimension $k$ [13]. The injection distance between two liftings of matrices is equal to the rank distance
between the matrices, hence the lifting of a rank metric code has the same minimum injection distance as
the original code. In particular, liftings of Gabidulin codes are nearly optimal constant-dimension codes
for which low-complexity decoding algorithms were proposed [1], [13].
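The equality between the injection distance of two liftings and the rank distance of the underlying matrices can be checked on small binary examples. The sketch below makes several assumptions that are not part of the text above: it works over GF(2) only, stores each row as an integer bitmask, uses hypothetical helper names, and evaluates the injection distance between two $k$-dimensional subspaces $U$ and $V$ as $\dim(U + V) - k$, which is the standard definition specialized to equal dimensions.

```python
def gf2_rank(vectors):
    """Rank over GF(2) of vectors given as integer bitmasks."""
    basis = {}  # leading-bit position -> reduced vector
    for v in vectors:
        while v:
            lead = v.bit_length() - 1
            if lead not in basis:
                basis[lead] = v
                break
            v ^= basis[lead]
    return len(basis)


def lift(rows, k, nu):
    """Rows of (I_k | M): k identity bits followed by the nu bits of each row of M."""
    return [((1 << (k - 1 - i)) << nu) | row for i, row in enumerate(rows)]


def rank_distance(m1, m2):
    """Rank of M1 - M2 over GF(2), where subtraction is bitwise XOR."""
    return gf2_rank([a ^ b for a, b in zip(m1, m2)])


def injection_distance(m1, m2, k, nu):
    """dim(U + V) - k for the liftings U, V of M1, M2 (both of dimension k)."""
    return gf2_rank(lift(m1, k, nu) + lift(m2, k, nu)) - k


# Two 2 x 3 binary matrices, one row per integer (illustrative values).
M1 = [0b101, 0b011]
M2 = [0b101, 0b110]
```

With these values, `rank_distance(M1, M2)` and `injection_distance(M1, M2, 2, 3)` agree, since stacking $(\mathbf{I}_k | \mathbf{M}_1)$ on top of $(\mathbf{I}_k | \mathbf{M}_2)$ has rank $k$ plus the rank of $\mathbf{M}_1 - \mathbf{M}_2$.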

Similarly, an operator channel has been proposed for error correction in SAF in [2]. Suppose $k$ packets
in $[q]^n$ are transmitted through a network with SAF, where we denote $[q] = \{0, 1, \ldots, q - 1\}$ for any
integer $q$. Also, assume the packets arrive at the destination in a different order from that in which they
were sent. Then only the set of packets is preserved, and SAF is modeled as the transmission of a subset of
cardinality $k$ of $[q^n]$. Codes on subsets have hence been proposed for error control in SAF with two
distinct metrics: the Hamming metric and the modified Hamming metric. Since subsets of $[q^n]$ are in
bijection with vectors in $\mathrm{GF}(2)^{q^n}$, codes on subsets can be viewed as binary codes; in particular, codes
on subsets with the same cardinality can be viewed as binary constant-weight codes.

Similarly to the case of constant-dimension codes, a practical construction of constant-weight codes
with length $q^n$ and weight $q^l$ is the lifting of a nonrestricted Hamming metric code in
$\mathrm{GF}(q^{n-l})^{q^l}$. The lifting $I_{\mathcal{S}}(\mathbf{X})$ of any word
$\mathbf{X} = (X_0, X_1, \ldots, X_{q^l - 1}) \in [q^{n-l}]^{q^l}$ is obtained by adding the header $i$, encoded
into $l$ symbols of $[q]$, in front of the packet corresponding to $X_i$ for all $0 \leq i \leq q^l - 1$.
Alternatively, it is the subset $\{x_0, x_1, \ldots, x_{q^l - 1}\} \in \mathcal{P}([q^n], q^l)$, where
$x_i = i q^{n-l} + X_i$ for $0 \leq i \leq q^l - 1$. The lifting $I_{\mathcal{S}}$ preserves the Hamming distance
up to a factor of two: $d_H(I_{\mathcal{S}}(\mathbf{X}), I_{\mathcal{S}}(\mathbf{Y})) = 2 d_H(\mathbf{X}, \mathbf{Y})$,
and liftings of nonrestricted Hamming metric codes can be used for error control with SAF.
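The lifting for SAF and its effect on the Hamming distance can be illustrated with a few lines of code. This is a minimal sketch under the assumption that symbols of $[q^{n-l}]$ and packets of $[q^n]$ are stored as plain integers; the function names are hypothetical.

```python
def saf_lift(word, q, n, l):
    """Lift X = (X_0, ..., X_{q^l - 1}) to the subset {i * q^(n-l) + X_i} of [q^n]."""
    block = q ** (n - l)
    assert len(word) == q ** l, "the word must have q^l symbols"
    assert all(0 <= x < block for x in word), "symbols must lie in [q^(n-l)]"
    return {i * block + x for i, x in enumerate(word)}


def word_hamming(x, y):
    """Hamming distance between two words of equal length."""
    return sum(a != b for a, b in zip(x, y))


def subset_hamming(s, t):
    """Hamming distance between subsets viewed as binary indicator vectors."""
    return len(s ^ t)  # size of the symmetric difference


# Example with q = 2, n = 4, l = 2: words of 4 symbols over [4] (illustrative values).
X = (0, 1, 2, 3)
Y = (0, 1, 3, 3)
```

In this example the two words differ in one position, and their liftings differ in two packets: the packet carrying header 2 is replaced, which removes one element from the subset and adds another.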

III. Transmission model 

A. Model and discussion 

In this section, we introduce a noncoherent communication model based on matroids for error-free data
transmission through a network. We consider a source wishing to transmit a message $M$ in the alphabet
$[A] = \{0, 1, \ldots, A - 1\}$ through a network toward a set of destinations. Let $(E, \mathcal{I})$ be a simple
matroid and denote its set of flats of rank $k$ as $\mathcal{F}_k$ for all $k$, and assume that both the source and
the destination know a common injective map $G$ from $[A]$ to $\mathcal{F}_k$.
The error-free data transmission follows three steps.

• Step I: at the source. The source encodes the original message $M$ into a flat $f = G(M) \in \mathcal{F}_k$. Then
a stream of elements of $f$ containing a basis of $f$ is transmitted into the network.

• Step II: in the network. Each intermediate node combines the elements it has previously received
by selecting and retransmitting elements of their closure.

• Step III: at each destination. The destination waits until it receives a basis of $f$, and then recovers
the original message by determining $M = G^{-1}(f)$.

We now provide several remarks regarding the matroids and the flats used in our model. First, we
consider flats of a matroid, for the matroid structure ensures that the rank function is well-behaved.
Indeed, a flat of rank k can only be described by k independent elements, no fewer and no more. Also,
the independence augmentation axiom reviewed in Section II-A guarantees that any set of fewer than k
independent elements can be extended into a basis of k elements of the flat.

Second, a non-simple matroid contains loops and parallel elements. By definition, a loop belongs 
to every flat and hence does not carry any information about the transmitted flat. Also, two parallel 
elements belong to the same flats and are combined in the same way, hence they carry the same 
information. Therefore, loops and parallel elements are unnecessary to the destination, and our assumption 
of considering simple matroids only does not lead to any loss of generality. 

Third, although flats of any rank may be sent, the following two reasons justify our assumption to 
send flats of the same rank k only. Foremost, no transmitted flat is properly contained in another,
thus rendering the decoding non-ambiguous. Also, the destination always expects the same number of 
independent elements to start decoding, hence simplifying the decoding process. 

Fourth, the number of possible combinations for an intermediate node is given by the cardinality of
the closure of the elements it has received. In general, not all flats of the same rank have the
same cardinality and the same number of bases, which results in different levels of protection against
packet losses. However, as shown below, SAF and RLNC use matroids for which all flats of the same rank have equal
cardinalities. Matroids satisfying this property are referred to as perfect matroid designs [16], [17, Section
3.4]. Due to their highly specific structure, very few classes of perfect matroid designs are known so far.
When considering a perfect matroid design, we shall denote the cardinality of any flat of rank $k$ as $C_k$
henceforth, where $C_0 = 0$ and $C_1 = 1$ for any simple perfect matroid design.

We also comment on the validity of our model and on some practical issues regarding its realization. 
Our model is general and does not take advantage of any knowledge of the network topology. It is hence 
noncoherent, and is robust to network topology variations, such as node or link appearance/disappearance. 
Accordingly, the intermediate nodes are assumed to operate blindly on the elements they receive,
regardless of the source, the destination, or the actual transmitted data.

In terms of practical implementation, without loss of generality, we assume that an element is encoded
in one packet of length n over GF(q). All intermediate nodes should have an efficient algorithm to
combine elements; this combination algorithm, however, is not guaranteed to yield a new basis of the
flat. This operation can be viewed as a form of random sampling on the elements of a flat. Also, the
destination needs an efficient algorithm to retrieve the original message from any basis of the flat. For a 
general matroid, efficient algorithms may not exist; however, we shall only consider matroids for which 
combining elements can be done efficiently. 

B. Matroids for SAF and RLNC 

We now determine the matroids associated to SAF and RLNC. 

First, for SAF, the only combination possible is the selection of an element, hence the flats are the
subsets of cardinality $k$ of $[q^n]$. The associated matroid is the free matroid $U_{q^n, q^n}$ and we have
$E = [q^n]$, $\mathcal{F}_k = \mathcal{P}(E, k)$ for all $0 \leq k \leq q^n$, and hence $N_k = \binom{q^n}{k}$ and
$C_k = k$. In order to use notations reflecting the protocol and the alphabet and length of packets, we denote
$U_{q^n, q^n}$ as $\mathcal{S}(q, n)$ or simply $\mathcal{S}$ when there is no ambiguity.

Second, our model differs slightly from the purely random linear combinations typically proposed
for RLNC. Indeed, a linear combination may yield the all-zero vector or collinear vectors: these are
respectively loops and parallel elements. However, our model considers the simple matroid associated to
RLNC, which is the projective geometry $\mathrm{PG}(n - 1, q)$. Its ground set $E$ is the set of one-dimensional
subspaces of $\mathrm{GF}(q)^n$, and $\mathcal{F}_k$ is a Grassmannian for $0 \leq k \leq n$, and hence
$N_k = \begin{bmatrix} n \\ k \end{bmatrix}$ and $C_k = \begin{bmatrix} k \\ 1 \end{bmatrix}$.
Clearly, the combinations operated by the intermediate nodes are linear combinations which ensure the
output vector is non-zero and has leading non-zero coefficient equal to 1, while decoding the message
at the destination is achieved via Gaussian elimination. We denote $\mathrm{PG}(n - 1, q)$ as $\mathcal{L}(q, n)$ or
simply $\mathcal{L}$ when there is no ambiguity henceforth.

IV. Random affine network coding 

In this section, we introduce a novel network coding scheme, referred to as random affine network
coding (RANC), where packets are viewed as points in an affine space and where intermediate nodes
combine packets by affine combinations. An affine combination of points
$\mathbf{v}_0, \mathbf{v}_1, \ldots, \mathbf{v}_{k-1} \in \mathrm{GF}(q)^n$ is any sum of the form
$\sum_{i=0}^{k-1} a_i \mathbf{v}_i$, where the scalars $a_i \in \mathrm{GF}(q)$ satisfy $\sum_{i=0}^{k-1} a_i = 1$.
In other words, an affine combination corresponds to determining the centroid of the points $\mathbf{v}_i$ with
masses $a_i$.

The set of all possible affine combinations of a collection of points, referred to as the affine hull,
forms an affine subspace. The matroid associated to RANC is hence given by the affine geometry
$\mathrm{AG}(n, q)$, which we will denote as $\mathcal{A}(q, n)$ or simply $\mathcal{A}$ if there is no ambiguity about
the parameter values. A collection of points $\mathbf{v}_0, \mathbf{v}_1, \ldots, \mathbf{v}_{k-1} \in \mathrm{GF}(q)^n$
are said to be affinely independent if $\sum_{i=0}^{k-1} b_i \mathbf{v}_i \neq \mathbf{0}$ for all $b_i$'s not all zero
and satisfying $\sum_{i=0}^{k-1} b_i = 0$. By definition, the rank of a set of
points is given by the number of affinely independent points, and is equal to the rank of their affine
hull. For any $\mathbf{v}_0, \mathbf{v}_1, \ldots, \mathbf{v}_{k-1} \in \mathrm{GF}(q)^n$, we then have
$\mathrm{rk}(\mathbf{v}_0, \mathbf{v}_1, \ldots, \mathbf{v}_{k-1}) = \mathrm{rank}(\mathbf{1} | \mathbf{V})$, where
$\mathbf{V} = (\mathbf{v}_0^T, \mathbf{v}_1^T, \ldots, \mathbf{v}_{k-1}^T)^T$ and rank denotes the number of linearly
independent rows of a matrix. The set $\mathcal{F}_k$ of flats of rank $k$ being the set of affine subspaces of rank
$k$, we have $N_k = q^{n-k+1} \begin{bmatrix} n \\ k-1 \end{bmatrix}$ and
$C_k = q^{k-1}$ for $1 \leq k \leq n + 1$ [3, Section 6.2].
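The identity $\mathrm{rk}(\mathbf{v}_0, \ldots, \mathbf{v}_{k-1}) = \mathrm{rank}(\mathbf{1} | \mathbf{V})$ suggests a direct way to compute affine ranks. The following is a minimal sketch that assumes a prime field size $p$ (so that arithmetic modulo $p$ is field arithmetic) and Python 3.8 or later for the modular inverse `pow(x, -1, p)`; the function names are hypothetical.

```python
def rank_mod_p(rows, p):
    """Rank of a matrix over GF(p), p prime, by Gaussian elimination."""
    m = [[x % p for x in row] for row in rows]
    rank = 0
    ncols = len(m[0]) if m else 0
    for col in range(ncols):
        pivot = next((r for r in range(rank, len(m)) if m[r][col]), None)
        if pivot is None:
            continue  # no pivot in this column
        m[rank], m[pivot] = m[pivot], m[rank]
        inv = pow(m[rank][col], -1, p)
        m[rank] = [(x * inv) % p for x in m[rank]]
        for r in range(len(m)):
            if r != rank and m[r][col]:
                f = m[r][col]
                m[r] = [(a - f * b) % p for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


def affine_rank(points, p):
    """Affine rank of points in GF(p)^n, computed as rank(1 | V)."""
    return rank_mod_p([[1] + list(v) for v in points], p)
```

For example, over GF(3) the three collinear points (0,0), (1,1), (2,2) have affine rank 2, while the single point (0,0) has affine rank 1 even though its linear rank is 0, which reflects the fact that RANC can use the all-zero packet.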

We now provide guidelines for the implementation of RANC. First, encoding messages (viewed as
the rows $\mathbf{m}_i$ of a matrix $\mathbf{M} \in \mathrm{GF}(q)^{k \times (n-k+1)}$) into affinely independent points
can be simply done by adding the header $\mathbf{I}'_k = (\mathbf{0} | \mathbf{I}_{k-1})^T$ to obtain
$(\mathbf{I}'_k | \mathbf{M}) \in \mathrm{GF}(q)^{k \times n}$. We shall refer to this encoding as the
affine lifting of the matrix $\mathbf{M}$. Second, since affine combinations are particular linear combinations, the
complexity of using RANC at the intermediate nodes is no higher than using RLNC. Third, we describe
the decoding algorithm at the destination, thus showing that this does not increase complexity either.
Suppose the destination receives $k$ affinely independent points $\mathbf{v}_0, \mathbf{v}_1, \ldots, \mathbf{v}_{k-1}$,
then the first $k$ columns of $(\mathbf{1} | \mathbf{V}) \in \mathrm{GF}(q)^{k \times (n+1)}$ are linearly independent.
Therefore, Gaussian elimination on $(\mathbf{1} | \mathbf{V})$ yields $(\mathbf{I}_k | \mathbf{M}')$, where
$\mathbf{M}' = (\mathbf{m}_0^T, (\mathbf{m}_1 - \mathbf{m}_0)^T, \ldots, (\mathbf{m}_{k-1} - \mathbf{m}_0)^T)^T$. The
decoding is finished by adding $\mathbf{m}_0$ to all the other rows of $\mathbf{M}'$. The complexity of the algorithm
is hence dominated by the inversion of a matrix of order $k$, which is similar to the complexity for RLNC. We finally
note that the Gaussian elimination could be modified in order to obtain the matrix
$(\mathbf{1} | \mathbf{I}'_k | \mathbf{M})$ directly.
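The encoding and decoding guidelines above can be sketched end to end. The code below is an illustration under stated assumptions, not the authors' implementation: the field size $p$ is assumed prime, Python 3.8 or later is assumed for `pow(x, -1, p)`, the coefficient vectors used by the network are fixed by hand rather than drawn at random, and all function names are hypothetical.

```python
def rref_mod_p(rows, p):
    """Reduced row echelon form over GF(p), p prime; returns (matrix, rank)."""
    m = [[x % p for x in row] for row in rows]
    rank = 0
    for col in range(len(m[0])):
        pivot = next((r for r in range(rank, len(m)) if m[r][col]), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        inv = pow(m[rank][col], -1, p)
        m[rank] = [(x * inv) % p for x in m[rank]]
        for r in range(len(m)):
            if r != rank and m[r][col]:
                f = m[r][col]
                m[r] = [(a - f * b) % p for a, b in zip(m[r], m[rank])]
        rank += 1
    return m, rank


def affine_lift(message_rows):
    """Prepend the header (0 | I_{k-1})^T: row 0 gets zeros, row i gets e_{i-1}."""
    k = len(message_rows)
    return [[1 if j == i - 1 else 0 for j in range(k - 1)] + list(row)
            for i, row in enumerate(message_rows)]


def affine_combination(packets, coeffs, p):
    """Combine packets with coefficients that sum to 1 in GF(p)."""
    assert sum(coeffs) % p == 1, "coefficients of an affine combination sum to 1"
    return [sum(c * pkt[j] for c, pkt in zip(coeffs, packets)) % p
            for j in range(len(packets[0]))]


def decode(received, k, p):
    """Recover the k message rows from received points spanning the flat."""
    reduced, rank = rref_mod_p([[1] + list(v) for v in received], p)
    if rank < k:
        raise ValueError("not enough affinely independent points received")
    m_prime = [row[k:] for row in reduced[:k]]  # rows m_0, m_1 - m_0, ...
    m0 = m_prime[0]
    return [m0] + [[(a + b) % p for a, b in zip(row, m0)] for row in m_prime[1:]]


# Example over GF(5) with k = 3 messages of 2 symbols each (illustrative values).
p = 5
M = [[1, 2], [3, 4], [0, 1]]
packets = affine_lift(M)
received = [affine_combination(packets, c, p) for c in ([1, 0, 0], [2, 3, 1], [4, 1, 1])]
decoded = decode(received, 3, p)
```

In this example the three coefficient vectors are linearly independent over GF(5), so the received points are affinely independent and `decoded` equals the original rows of `M`.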

As seen in Section III-B, the simple matroid associated to RLNC is the projective geometry with
rank $n$, whose alphabet only has $\begin{bmatrix} n \\ 1 \end{bmatrix} \sim q^{n-1}$ elements. This implies a loss
in terms of data rate, as the elements are not optimally encoded into packets of length $n$. Similarly, any linear
subspace has $\begin{bmatrix} k \\ 1 \end{bmatrix} \sim q^{k-1}$ elements, which compared to the $q^k$ possible linear
combinations, leads to a decrease in the number of possible combinations. These issues are immediate consequences of
the existence of a loop (the all-zero vector) and parallel elements (collinear vectors). Unlike RLNC, the matroid
associated to RANC has rank $n + 1$ and $q^n$ elements. By construction, $\mathcal{A}(q, n) = \mathrm{AG}(n, q)$ is a
submatroid of $\mathcal{L}(q, n + 1) = \mathrm{PG}(n, q)$. However, we shall demonstrate in the following that
$\mathcal{A}(q, n)$ behaves closely to $\mathcal{L}(q, n + 1)$, hence virtually allowing us to work on packets of
length $n + 1$ instead of $n$.

We illustrate the difference between RLNC and RANC by using the butterfly network, depicted in
Figure 1, where the source $S$ wants to transmit two messages $m$ and $n$ over GF(q) to the destinations $D_1$
and $D_2$. First, suppose RLNC is used. The source then encodes these messages into linearly independent
vectors with their first non-zero coordinate equal to one by adding the following headers: $\mathbf{x} = (1\ 0 | m)$
and $\mathbf{y} = (0\ 1 | n)$. The only linear combination of one vector is simply the vector itself; all the linear
combinations of $\mathbf{x}$ and $\mathbf{y}$ can be expressed as $\mathbf{x} + a\mathbf{y}$, where
$a \in \mathrm{GF}(q)$. There are hence $q$ combinations possible, and $q - 1$ lead to successful decoding at the
destinations (if $a \neq 0$), leading to a success probability of $\frac{q-1}{q}$, which tends to 1 for large $q$.
We remark that our model is consistent with the typical approach of RLNC, which allows combinations of the type
$a\mathbf{x} + b\mathbf{y}$ instead.
Now suppose RANC is used. The source then encodes the messages into affinely independent points by
adding the following headers: $\mathbf{u} = (0 | m)$ and $\mathbf{v} = (1 | n)$. Note that the header is only one
symbol long, illustrating the gain of one symbol per packet of using RANC over RLNC. The only affine combination
of one point is the point itself; all the affine combinations of $\mathbf{u}$ and $\mathbf{v}$ can be expressed as
$b\mathbf{u} + (1 - b)\mathbf{v}$, where $b \in \mathrm{GF}(q)$. Therefore, there are $q$ combinations possible, and
$q - 2$ lead to successful decoding at the destinations (if $b \notin \{0, 1\}$), so the success probability is
$\frac{q-2}{q}$. This probability is zero for the binary field, since in the very particular case of two points in a
binary affine geometry, RANC actually reduces to SAF. However, for large $q$, it tends to 1 nearly as fast as its
counterpart for RLNC. Finally, note that the decoding of the messages at destination $D_1$, which receives the points
$(0 | m)$ and $(1 - b \,|\, bm + (1 - b)n)$, is straightforward (similarly for $D_2$): construct the matrix

$$\begin{pmatrix} 1 & 0 & m \\ 1 & 1 - b & bm + (1 - b)n \end{pmatrix},$$

which after Gaussian elimination yields

$$\begin{pmatrix} 1 & 0 & m \\ 0 & 1 & -m + n \end{pmatrix},$$

and obtain $m$ and $n$.
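The success probabilities on the butterfly network can be recovered by enumerating the coefficient chosen at the coding node. The sketch below assumes a prime field size $q$, so that arithmetic modulo $q$ is field arithmetic, and represents each packet only by its coefficients with respect to the two source packets; the function names are hypothetical.

```python
from fractions import Fraction


def det2(r1, r2, q):
    """Determinant modulo q of the 2 x 2 matrix with rows r1 and r2."""
    return (r1[0] * r2[1] - r1[1] * r2[0]) % q


def butterfly_success_counts(q):
    """Number of coding coefficients for which both destinations can decode."""
    # RLNC: the coding node sends x + a*y; D1 also holds x, D2 also holds y.
    rlnc = sum(1 for a in range(q)
               if det2((1, 0), (1, a), q) and det2((0, 1), (1, a), q))
    # RANC: the coding node sends b*u + (1 - b)*v; D1 also holds u, D2 also holds v.
    ranc = sum(1 for b in range(q)
               if det2((1, 0), (b, (1 - b) % q), q) and det2((0, 1), (b, (1 - b) % q), q))
    return rlnc, ranc


def butterfly_success_probabilities(q):
    """Success probabilities when the coefficient is chosen uniformly in GF(q)."""
    rlnc, ranc = butterfly_success_counts(q)
    return Fraction(rlnc, q), Fraction(ranc, q)
```

A destination decodes exactly when its two received packets have independent coefficient vectors, which is what the non-zero determinant tests; the counts returned match the $q - 1$ and $q - 2$ favorable coefficients derived above.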



V. Parameters for error-free networks 

A. General assumptions 

In this section, we define, determine, and compare some performance parameters of different matroids,
hence leading to a performance comparison of different network protocols. In order to carry out this
study, we need to add the following assumption to the model described in Section III. Since our model
is noncoherent, it makes the network topology and the statistical dependency amongst packets due to the
order of combinations transparent at the message level. Accordingly, we suppose that each destination
receives elements chosen independently and uniformly amongst all elements of the flat. This assumption
can be viewed as a generalization of the multiplicative matrix channel proposed for RLNC in [18].
Moreover, it is motivated by file dissemination [19] and is similar to the setting in [20]. Note, however,
that the study in [20], [21] is based on simple independence assumptions for RLNC and considers delays
for the whole set of destinations, while we shall derive fine results for any matroid by viewing each
destination separately. Furthermore, we believe this assumption provides a good intuition on how the
protocols behave and it allows for the thorough performance study below. The parameters we introduce
illustrate the impact of the number of possible combinations offered by different protocols in terms of
data rate, delay, and partial decoding.

[Fig. 1. Transmitting data on the butterfly network using RLNC or RANC. Panel (b): RANC.]

We comment on the term "error-free" used in the title of this section. Although the destination does 
not necessarily recover the whole transmitted flat immediately, it keeps receiving elements of the flat and 
hence it will almost surely be able to reconstruct the whole flat sent by the source. The term error-free 
indicates that no other flat of the same rank can be reconstructed by the destination, and hence only the
message sent by the source can be decoded.

B. Matroid rate, average delay, and throughput 

The data rate of the communication is given by the ratio between the amount of information decoded and the amount of data needed to transmit a flat: $\frac{\log_q \Lambda}{nk} = R_{\mathrm{code}}(\Lambda, \mathcal{M})\, R(\mathcal{M}, k)$, where $R_{\mathrm{code}}(\Lambda, \mathcal{M}) = \frac{\log_q \Lambda}{\log_q N_k}$ can be viewed as the rate of the code formed by all the possible transmitted flats and the matroid rate is defined as

$$R(\mathcal{M},k) = \frac{\log_q N_k}{nk}. \qquad (1)$$

We remark that $R_{\mathrm{code}}(\Lambda, \mathcal{M})$ only depends on the encoding of the message into a flat, and does not depend on the actual matroid (we only require $N_k \ge \Lambda$). Therefore, we only focus on the matroid rate henceforth, which indicates how efficiently a flat of rank $k$ is encoded into a message of $k$ packets. We can further decompose the matroid rate into $R(\mathcal{M},k) = \frac{\log_{|E|} N_k}{k} \cdot \frac{\log_q |E|}{n}$, where the first ratio is an intrinsic property of the matroid, while the second ratio indicates how efficiently a matroid element is encoded into a packet. Note that the rate is entirely determined by the lattice of flats of $\mathcal{M}$, and does not depend on the cardinalities of flats. Proposition 1 below determines the matroid rates of SAF, RLNC, and RANC.

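Proposition 1 (Matroid rate of SAF, RLNC, and RANC): The matroid rates of SAF, RLNC, and RANC are respectively given by

$$1 - \frac{\log_q k}{n} \le R(\mathcal{S},k) = \frac{\log_q \binom{q^n}{k}}{nk} < 1 - \frac{\log_q k - \log_q e}{n},$$

$$1 - \frac{k}{n} \le R(\mathcal{C},k) = \frac{\log_q {n \brack k}}{nk} < 1 - \frac{k + k^{-1}\log_q K_q}{n},$$

$$1 - \frac{k-1}{n} \le R(\mathcal{A},k) = \frac{n-k+1+\log_q {n \brack k-1}}{nk} < 1 - \frac{k-1 + k^{-1}\log_q K_q}{n}.$$

Proof: For SAF, we have $N_k = \binom{q^n}{k}$; since $\left(\frac{q^n}{k}\right)^k \le \binom{q^n}{k} < \left(\frac{e q^n}{k}\right)^k$, we obtain the bounds on $R(\mathcal{S},k)$. For RLNC, we have $N_k = {n \brack k}$ and $q^{k(n-k)} \le {n \brack k} < K_q^{-1} q^{k(n-k)}$, as reviewed in Section II-A. For RANC, we have $N_k = q^{n-k+1} {n \brack k-1}$. ■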



RANC allows a gain in terms of rate over RLNC of about one symbol per packet, due to the increase in the number of flats from around $q^{k(n-k)}$ to around $q^{k(n-k+1)}$. This gain follows from the fact that RLNC only considers around $q^{-1}$ of all possible packets of length $n$, while RANC considers all possible packets.
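The rate comparison can be reproduced numerically from definition (1). The sketch below assumes the flat counts $N_k = \binom{q^n}{k}$ for SAF, ${n \brack k}$ for RLNC, and $q^{n-k+1}{n \brack k-1}$ for RANC; the helper names `gaussian_binomial` and `matroid_rates` are illustrative only.

```python
from math import comb, log

def gaussian_binomial(n, k, q):
    """Number of k-dimensional subspaces of GF(q)^n (Gaussian binomial)."""
    if k < 0 or k > n:
        return 0
    num = den = 1
    for i in range(k):
        num *= q ** (n - i) - 1
        den *= q ** (i + 1) - 1
    return num // den  # the quotient is always an integer

def matroid_rates(q, n, k):
    """Return (R_SAF, R_RLNC, R_RANC) using R = log_q(N_k) / (n * k)."""
    counts = (
        comb(q ** n, k),                                    # SAF: k-subsets of q^n packets
        gaussian_binomial(n, k, q),                         # RLNC: linear subspaces
        q ** (n - k + 1) * gaussian_binomial(n, k - 1, q),  # RANC: affine subspaces
    )
    return tuple(log(N, q) / (n * k) for N in counts)
```

For instance, `matroid_rates(4, 8, 3)` gives an RANC rate that exceeds the RLNC rate by roughly $1/n$, which is the gain of one symbol per packet discussed above.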



According to the assumptions made in Section V-A, the packets arrive at the destination at random. Therefore, the number of packets to be received in order to obtain $k$ independent packets, referred to as the delay of a transmission, is a random variable. Clearly, the minimum delay is exactly $k$, while the maximum delay is unbounded. We hence define the average delay $D(\mathcal{M},k)$ of a transmission as the expected number of packets received in order to obtain $k$ independent packets. Clearly, $D(\mathcal{M},k) \ge k$ for any matroid $\mathcal{M}$. By generalizing the approach typically used to solve the coupon collector problem [22], we obtain for a perfect matroid design where all flats of rank $k$ have cardinality $c_k$ for all $k$,

$$D(\mathcal{M},k) = \sum_{i=0}^{k-1} \frac{c_k}{c_k - c_i}. \qquad (2)$$

We now determine the value of the average delay for SAF, RLNC, and RANC. 
Proposition 2 (Average delay of SAF, RLNC, and RANC): The average delays of SAF, RLNC, and RANC are respectively given by

$$k(\log k + \gamma) < D(\mathcal{S},k) = k\sum_{i=1}^{k} \frac{1}{i} < k(\log k + \gamma) + \frac{1}{2},$$

$$k \le D(\mathcal{C},k) = k + \sum_{j=1}^{k-1} \frac{1 - q^{j-k}}{q^j - 1} < k + \frac{q}{(q-1)^2},$$

$$k \le D(\mathcal{A},k) = k + \sum_{j=1}^{k-1} \frac{1}{q^j - 1} < k + \frac{q}{(q-1)^2},$$

where $\gamma \approx 0.577$ is Euler's constant.

Proposition 2 indicates that in RLNC and RANC, the expected number of packets needed to decode the subspace completely tends to the rank of the subspace as $q$ tends to infinity. The delay of RANC is very close to that of RLNC since, by (2), the average delay is determined by the cardinality of flats, which only changes from $\frac{q^k-1}{q-1}$ for RLNC to $q^{k-1}$ for RANC.
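Equation (2) is easy to evaluate exactly. The sketch below assumes the flat cardinalities $c_i = i$ for SAF, $c_i = \frac{q^i-1}{q-1}$ for RLNC, and $c_0 = 0$, $c_i = q^{i-1}$ for RANC; the function names are hypothetical.

```python
from fractions import Fraction

def average_delay(card, k):
    """Equation (2): sum over i < k of c_k / (c_k - c_i), with c_i = card(i)."""
    ck = card(k)
    return sum(Fraction(ck, ck - card(i)) for i in range(k))

def card_saf(i):
    # A flat of rank i in the free matroid is a set of i elements.
    return i

def card_rlnc(q):
    # A flat of rank i in the projective geometry has (q^i - 1)/(q - 1) points.
    return lambda i: (q ** i - 1) // (q - 1)

def card_ranc(q):
    # A flat of rank i >= 1 in the affine geometry has q^(i-1) points.
    return lambda i: 0 if i == 0 else q ** (i - 1)
```

With $q = 2^8$ and $k = 10$, `float(average_delay(card_saf, 10))` is about 29.3 while `float(average_delay(card_ranc(256), 10))` is about 10.004, in line with the values quoted for Figure 2.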

We now define the throughput of a matroid as the ratio of the amount of transmitted information to the amount of data received on average by the destination. In other words, it measures the proportion of useful information in each packet received by the destination. By definition, the throughput is given by

$$T(\mathcal{M},k) = \frac{\log_q N_k}{n D(\mathcal{M},k)} = \frac{k R(\mathcal{M},k)}{D(\mathcal{M},k)}. \qquad (3)$$

This provides an indication on the desirable properties of a matroid for network communications. By (3), a matroid should maintain a low average delay, while trying to maximize its data rate. By (2), minimizing the average delay is equivalent to minimizing the ratios $\frac{c_i}{c_k}$; also, by (1) the matroid rate increases with the number of flats $N_k$. A matroid should hence have a large number of flats, whose cardinalities increase rapidly with their ranks.

Combining the results in Propositions 1 and 2, the throughputs of SAF, RLNC, and RANC are respectively around

$$T(\mathcal{S},k) \sim \frac{1}{\log k} - \frac{1}{n \log q}, \qquad T(\mathcal{C},k) \sim 1 - \frac{k}{n}, \qquad T(\mathcal{A},k) \sim 1 - \frac{k-1}{n}. \qquad (4)$$

By (4), the throughputs of RLNC and RANC are higher for small values of $k$, but decrease linearly with $k$. On the other hand, the throughput of SAF only decreases with the logarithm of $k$, hence this protocol is more appropriate for messages with a large number of packets, which confirms the assumptions made in [2]. The increase in rate and the constant delay between RANC and RLNC lead to a gain in throughput of one symbol per packet in (4).

C. Number of received independent packets 

We now investigate the delay in greater detail by considering the random variable $X_I(\mathcal{M},k;r)$ given by the number of informative packets received by the destination once $r$ packets have been received. The variable $r - X_I(\mathcal{M},k;r)$ measures the random redundancy inherent to the noncoherent transmission. Therefore, $\Pr\{X_I(\mathcal{M},k;r) = l\}$ is equal to the probability $P_I(\mathcal{M},k;r,l)$ to obtain $l$ independent packets once $r$ packets have been received. An important special case is given by the probability $P_I(\mathcal{M},k;k,k)$ to receive all the necessary independent packets to reconstruct the flat with minimum delay. Proposition 3 below determines a recursive way of computing $P_I(\mathcal{M},k;r,l)$.

Proposition 3 (Probability of independence): We have $P_I(\mathcal{M},k;r,0) = \delta_{r,0}$ and for $r \ge 0$,

$$P_I(\mathcal{M},k;r+1,l+1) = \left(1 - \frac{c_l}{c_k}\right) P_I(\mathcal{M},k;r,l) + \frac{c_{l+1}}{c_k} P_I(\mathcal{M},k;r,l+1).$$

In particular, $P_I(\mathcal{M},k;k,k) = \prod_{i=0}^{k-1} \left(1 - \frac{c_i}{c_k}\right)$.

Proof: In order to obtain $l+1$ independent packets after receiving $r+1$ packets, one must have received either $l$ or $l+1$ independent packets in the first $r$ received packets. Hence $P_I(\mathcal{M},k;r+1,l+1) = p_0 P_I(\mathcal{M},k;r,l) + p_1 P_I(\mathcal{M},k;r,l+1)$, where $p_0 = \frac{c_k - c_l}{c_k}$ is the probability to receive a packet outside of a flat of rank $l$ and $p_1 = \frac{c_{l+1}}{c_k}$ is the probability to receive a packet inside of a flat of rank $l+1$. Applying this recursion successively for $l = r$ yields $P_I(\mathcal{M},k;k,k)$. ■
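The recursion of Proposition 3 translates directly into a small dynamic program. This is a sketch under the perfect matroid design assumption, with the flat cardinalities passed in as a function `card` (a hypothetical name); exact rational arithmetic avoids rounding issues.

```python
from fractions import Fraction

def independence_probs(card, k, r_max):
    """Return P with P[r][l] = P_I(M, k; r, l) for 0 <= r <= r_max, 0 <= l <= k.

    card(i) is the cardinality c_i of a flat of rank i.
    """
    ck = card(k)
    P = [[Fraction(0)] * (k + 1) for _ in range(r_max + 1)]
    P[0][0] = Fraction(1)                       # nothing received yet
    for r in range(r_max):
        for l in range(k + 1):
            if P[r][l] == 0:
                continue
            stay = Fraction(card(l), ck)        # new packet lies in the current flat
            P[r + 1][l] += stay * P[r][l]
            if l < k:
                P[r + 1][l + 1] += (1 - stay) * P[r][l]  # new packet is informative
    return P
```

For SAF with $k = 3$ (`card = lambda i: i`), the entry `P[3][3]` equals $3!/3^3 = 2/9$, the probability of decoding with minimum delay.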
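We derive closed-form formulas of the probability of independence for SAF, RLNC, and RANC in Proposition 4 below.

Proposition 4 (Probability of independence for SAF, RLNC, and RANC): We have for all $l \ge 1$

$$P_I(\mathcal{S},k;r,l) = \frac{k!}{(k-l)!\,k^r} {r \brace l},$$

$$P_I(\mathcal{C},k;r,l) = \frac{{k \brack l}}{(q^k-1)^r} \sum_{s=0}^{r} (-1)^{r-s} \binom{r}{s} \prod_{i=0}^{l-1} (q^s - q^i),$$

$$P_I(\mathcal{A},k;r,l) = q^{-(k-1)(r-1)} {k-1 \brack l-1} \prod_{i=0}^{l-2} (q^{r-1} - q^i),$$

where ${r \brace l}$ is a Stirling number of the second kind [23]. In particular,

$$\sqrt{2\pi k}\, e^{-k + \frac{1}{12k+1}} < P_I(\mathcal{S},k;k,k) = \frac{k!}{k^k},$$

$$K_q < P_I(\mathcal{C},k;k,k) = \prod_{i=1}^{k-1} \left(1 - \frac{q^i - 1}{q^k - 1}\right) \le 1,$$

$$K_q < P_I(\mathcal{A},k;k,k) = q^{-(k-1)^2} \prod_{i=0}^{k-2} (q^{k-1} - q^i) \le 1,$$

where $P_I(\mathcal{M},k;k,k)$ is the probability that all $k$ independent packets are received with minimum delay.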

Proof: For SAF, $P_I(\mathcal{S},k;r,l) = \frac{S_I(r,l)}{k^r}$, where $S_I(r,l)$ is the number of words of length $r$ with $l$ distinct symbols from an alphabet of size $k$. Any word with $l$ distinct symbols can be put in correspondence with the partition of $[r]$ into $l$ cells, where each cell contains the positions of a given symbol in the word. By definition of the Stirling numbers, there are ${r \brace l}$ such partitions. Also, once the partition is fixed, there are $k(k-1)\cdots(k-l+1)$ choices for the symbols. Combining, we obtain the formula for $P_I(\mathcal{S},k;r,l)$. For $r = l = k$, we obtain $P_I(\mathcal{S},k;k,k) = \frac{k!}{k^k}$, which combined with the refinement of Stirling's formula in [24] yields the bound.

For RLNC, we have $P_I(\mathcal{C},k;r,l) = R_I(r,l) {k \brack 1}^{-r}$, where $R_I(r,l)$ is the number of matrices in $\mathrm{GF}(q)^{k \times r}$ with rank $l$ such that all the columns are nonzero and the leading nonzero coefficient of each column is equal to 1. The number of matrices with rank $l$ and $s$ nonzero columns is hence given by $\binom{r}{s}(q-1)^s R_I(s,l)$. Also, the number of matrices in $\mathrm{GF}(q)^{k \times r}$ with rank $l$ is given by ${k \brack l} \prod_{i=0}^{l-1}(q^r - q^i)$. Summing all matrices with rank $l$ and $s$ nonzero columns for $l \le s \le r$, we obtain

$$\sum_{s=l}^{r} \binom{r}{s} (q-1)^s R_I(s,l) = {k \brack l} \prod_{i=0}^{l-1} (q^r - q^i).$$

By applying the reverse binomial transform [25], we obtain the formula for $P_I(\mathcal{C},k;r,l)$.

For RANC, we have $P_I(\mathcal{A},k;r,l) = \frac{A_I(r,l)}{q^{r(k-1)}}$, where $A_I(r,l)$ is the number of collections of points $v_0, v_1, \ldots, v_{r-1} \in \mathrm{GF}(q)^n$ in a flat of rank $k$ with exactly $l$ affinely independent points. We have $\mathrm{rk}(v_0, v_1, \ldots, v_{r-1}) = \mathrm{rank}(\mathbf{1}|\mathbf{V}) = 1 + \mathrm{rank}(\mathbf{W})$, where $\mathbf{W} = ((v_1 - v_0)^T, (v_2 - v_0)^T, \ldots, (v_{r-1} - v_0)^T)^T \in \mathrm{GF}(q)^{(r-1) \times n}$ is a matrix with rank $l-1$ whose rows belong to a linear subspace of dimension $k-1$. There are $q^{k-1}$ choices for $v_0$ and ${k-1 \brack l-1} \prod_{i=0}^{l-2}(q^{r-1} - q^i)$ choices for $\mathbf{W}$, and hence $A_I(r,l) = q^{k-1} {k-1 \brack l-1} \prod_{i=0}^{l-2}(q^{r-1} - q^i)$, which leads to the result for $P_I(\mathcal{A},k;r,l)$. ■
We now investigate the moments of the probability distribution $P_I(\mathcal{M},k;r,l)$, in particular the expectation $E_I(\mathcal{M},k;r)$ and the variance $V_I(\mathcal{M},k;r)$. We clearly have $E_I(\mathcal{M},k;r) \le \min\{k,r\}$, $E_I(\mathcal{M},k;r) \le E_I(\mathcal{M},k;r+1) \le E_I(\mathcal{M},k;r) + 1$, and $\lim_{r \to \infty} E_I(\mathcal{M},k;r) = k$ and $\lim_{r \to \infty} V_I(\mathcal{M},k;r) = 0$. Proposition 5 below determines or bounds the expectation and the variance for SAF, RLNC, and RANC.

Proposition 5 (Average number of independent elements in SAF, RLNC, and RANC): For all $k$ and $r$,

$$E_I(\mathcal{S},k;r) = k\left[1 - \left(1 - \frac{1}{k}\right)^r\right] \sim k\left(1 - e^{-r/k}\right). \qquad (5)$$

Also, the variance is given by

$$V_I(\mathcal{S},k;r) = k\left[\left(1 - \frac{1}{k}\right)^r - \left(1 - \frac{2}{k}\right)^r\right] + k^2\left[\left(1 - \frac{2}{k}\right)^r - \left(1 - \frac{1}{k}\right)^{2r}\right].$$

For RLNC and RANC, we have $P_I(\mathcal{C},k;r,r) \ge K_q$ and $P_I(\mathcal{A},k;r,r) \ge K_q$ for all $r \le k$ by Proposition 4. Therefore, $E_I(\mathcal{C},k;r) \ge K_q \min\{r,k\}$ and $E_I(\mathcal{A},k;r) \ge K_q \min\{r,k\}$ for all $r$, and accordingly, the variance tends to $0$ with the field size.

Proof: Let $S_I(r,l)$ denote the number of words of length $r$ with $l$ distinct symbols, then $P_I(\mathcal{S},k;r,l) = k^{-r} S_I(r,l)$. Consider the bipartite graph on $[k]^r$ and the subsets of $[k]$ of cardinality $a$ ($1 \le a \le k$), where two vertices are adjacent if and only if the $a$ elements of the subset all appear as symbols of the word in $[k]^r$. Let us count the number of edges in this graph in two different ways. First, there are $\binom{l}{a}$ edges adjacent to a word in $[k]^r$ with $l$ different symbols, hence there are $\sum_{l=a}^{k} \binom{l}{a} S_I(r,l)$ edges in the graph. Second, by the inclusion-exclusion principle, we obtain that each subset of $[k]$ with cardinality $a$ appears in exactly $\sum_{i=0}^{a} (-1)^i \binom{a}{i} (k-i)^r$ words in $[k]^r$. Therefore,

$$\sum_{l=a}^{k} \binom{l}{a} S_I(r,l) = \binom{k}{a} \sum_{i=0}^{a} (-1)^i \binom{a}{i} (k-i)^r,$$

which yields (5) for $a = 1$. Using this identity for $a = 2$ and combining, we also obtain the variance. ■
In particular, for $r = k$, (5) indicates that only around $1 - e^{-1} \approx 0.632$ of the first $k$ received packets

are independent on average. On the other hand, for RLNC and RANC, the average number of independent 

packets tends to the optimal with the field size. 
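The SAF expectation can be cross-checked by brute force for small parameters. The sketch below compares the closed form $k\left[1 - (1 - 1/k)^r\right]$ with an exhaustive average over all $k^r$ equally likely sequences of received packets; both function names are illustrative.

```python
from itertools import product

def expected_independent_saf(k, r):
    """Closed form: expected number of distinct packets among r uniform draws from k."""
    return k * (1 - (1 - 1 / k) ** r)

def expected_independent_saf_bruteforce(k, r):
    """Average number of distinct symbols over all k^r words of length r."""
    total = sum(len(set(word)) for word in product(range(k), repeat=r))
    return total / k ** r
```

For $k = r = 10$ the closed form gives about 6.51 independent packets, slightly above the asymptotic fraction $1 - e^{-1}$ because $k$ is finite.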

The expected number of independent elements for RANC and SAF determined or bounded above is illustrated in Figure 2 for $q = 2^8$, $n = 20$, $k = 10$, and $1 \le r \le 30$. For SAF, the exponential pattern determined in Proposition 5 is clearly displayed, while RANC is close to optimal for practical values of $q$. For $k = 10$, Proposition 2 indicates that the average delay is given by $D(\mathcal{S}(2^8,20),10) \approx 29.3$ for SAF while $D(\mathcal{A}(2^8,20),10) \approx 10.004$ for RANC.

Fig. 2. Expected number of independent elements as a function of the number of received elements for $k = 10$ transmitted elements

D. Partial decoding 



The model introduced in Section III and the parameters determined so far assume the destination waits to receive $k$ independent packets in order to begin the decoding procedure. However, the destination may choose to operate partial decoding on a fraction of $k$ independent packets. The problem of partial decoding is hence as follows: Suppose that fewer than $k$ independent packets have been received, how many packets do we expect to decode? We remark that this problem is intrinsic to the matroid, and does not depend on the assumptions on the model made in the introduction of Section V.

The destination can perform partial decoding if it knows a way of recovering the messages in all the elements contained in the flat it has received which were originally transmitted by the source. This is equivalent to transmitting elements of a canonical basis, defined as follows. A basis $B(f)$ of a flat $f$ is a canonical basis if the elements of $B(f)$ can be discriminated from the other elements of $f$ and if the information carried by any element of $B(f)$ can be decoded using this element only. The first property guarantees that the destination receiving a subflat $g$ of the transmitted flat $f$ will be able to determine all the elements transmitted by the source which lie in $g$, while the second property guarantees that the destination will decode the information carried by these elements. In other words, a canonical basis is a systematic encoding of the message, and the destination wishes to retrieve the systematic part of the elements of the canonical basis. It is still unknown which flats of a matroid have a canonical basis; however, it can be easily shown that the liftings associated to SAF, RLNC, and RANC have a canonical basis given by the set of rows of the matrix.

We illustrate partial decoding using the linear lifting for RLNC. Suppose three messages $m$, $n$, and $p$ over $\mathrm{GF}(q)$ are to be sent using RLNC. After linear lifting, the source transmits the following three vectors: $(100|m)$, $(010|n)$, and $(001|p)$, which form the canonical basis of the transmitted flat. Suppose the destination first receives the vector $(1a0|m + an)$, which does not belong to the canonical basis and hence cannot be decoded. Next, suppose the destination receives $(1b0|m + bn)$ with $b \ne a$. Two members of the canonical basis clearly belong to the closure of the received vectors, therefore the destination can decode the two messages $m$ and $n$ before receiving the entire flat.

Let $P_D(\mathcal{M},k;l,d)$ be the probability to decode $d$ elements after receiving $l$ independent elements. Provided a canonical basis exists, decoding $d$ elements is equivalent to receiving a flat containing $d$ members of the canonical basis of the transmitted flat. We determine $P_D(\mathcal{M},k;l,d)$ under certain assumptions on the matroid, which are satisfied by $\mathcal{C}$, $\mathcal{S}$, and $\mathcal{A}$.

Proposition 6 (Probability of partial decoding): Suppose the transmitted flat of rank $k$ contains $G(l,k)$ flats of rank $l$ for all $0 \le l \le k$. Furthermore, assume that for all $a \le l \le k$, any flat with rank $a$ is contained in $F(a,l)$ flats of rank $l$, all of them being contained in the transmitted flat. We then have

$$P_D(\mathcal{M},k;l,d) = \frac{1}{G(l,k)} \sum_{a=d}^{l} (-1)^{a+d} \binom{a}{d} \binom{k}{a} F(a,l).$$

Proof: The probability is given by $P_D(\mathcal{M},k;l,d) = \frac{N_D(l,d)}{G(l,k)}$, where $N_D(l,d)$ is the number of flats of rank $l$ that are contained within the transmitted flat $f$ and which have $d$ decodable elements. We now determine the value of $N_D(l,d)$.

First, the set of flats of rank $l$ with $l$ decodable elements is given by $\{\mathrm{cl}(X) : |X| = l, X \subseteq B(f)\}$, and hence $N_D(l,l) = \binom{k}{l}$. Now consider the bipartite graph on the set of subflats of $f$ of rank $l$ and the set of flats of rank $a$ with $a$ decodable elements ($a \le d \le l$), where two vertices $f_l \in \mathcal{F}_l$, $f_a \in \mathcal{F}_a$ are adjacent if and only if $f_a \subseteq f_l$. We now count the edges in this graph in two ways. Since there are $F(a,l)$ edges adjacent to any flat of rank $a$, the number of edges is given by $\binom{k}{a} F(a,l)$. Also, there are $\binom{d}{a}$ edges adjacent to any flat of rank $l$ with $d$ decodable elements, hence there are $\sum_{d=a}^{l} \binom{d}{a} N_D(l,d)$ edges. Combining, we obtain

$$\sum_{d=a}^{l} \binom{d}{a} N_D(l,d) = \binom{k}{a} F(a,l). \qquad (6)$$

Denoting $\mathbf{n} = (N_D(l,a), N_D(l,a+1), \ldots, N_D(l,l))$ and $\mathbf{v} = \left(\binom{k}{a}F(a,l), \binom{k}{a+1}F(a+1,l), \ldots, \binom{k}{l}F(l,l)\right)$, (6) becomes $\mathbf{n}\mathbf{L} = \mathbf{v}$, where $\mathbf{L} = (l_{d,a})$ is a Pascal matrix: $l_{d,a} = \binom{d}{a}$ [26]. Since $\mathbf{L}^{-1} = (m_{a,d})$ has $m_{a,d} = (-1)^{d+a}\binom{a}{d}$, we obtain the formula for $P_D(\mathcal{M},k;l,d)$. ■
We remark that (6) provides the binomial moments of the $P_D(\mathcal{M},k;l,d)$ distribution. In particular, the expectation $E_D(\mathcal{M},k;l)$ and the variance $V_D(\mathcal{M},k;l)$ are respectively given by

$$E_D(\mathcal{M},k;l) = k\frac{F(1,l)}{G(l,k)},$$

$$V_D(\mathcal{M},k;l) = \frac{k(k-1)F(2,l) + kF(1,l)}{G(l,k)} - k^2\frac{F(1,l)^2}{G(l,k)^2}.$$

Corollary 1 (Probability of partial decoding for SAF, RLNC, and RANC): For SAF, $P_D(\mathcal{S},k;l,d) = \delta_{l,d}$ for all $l$, $d$, and hence $E_D(\mathcal{S},k;l) = l$ and $V_D(\mathcal{S},k;l) = 0$ for all $l$. For RLNC, we have $E_D(\mathcal{C},k;l) = k\frac{q^l-1}{q^k-1} \le kq^{l-k}$ for all $l$. For RANC, we have $E_D(\mathcal{A},k;l) = kq^{l-k}$ for all $l$.

Proof: For SAF, all elements are decodable and $P_D(\mathcal{S},k;l,d) = \delta_{l,d}$ for all $l$, $d$. This can also be demonstrated via Proposition 6, where $F(a,l) = \binom{k-a}{l-a}$ and $G(l,k) = \binom{k}{l}$. For RLNC, we have $F(a,l) = {k-a \brack l-a}$ and $G(l,k) = {k \brack l}$ by [27, Lemma 2], which yields $E_D(\mathcal{C},k;l) = k\frac{q^l-1}{q^k-1}$. For RANC, we have $F(0,l) = q^{k-l}{k-1 \brack l-1}$, $F(a,l) = {k-a \brack l-a}$ for $a > 0$, and $G(l,k) = q^{k-l}{k-1 \brack l-1}$. ■

We remark that $E_D(\mathcal{A},k;k-1) = kq^{-1}$ by Corollary 1, hence for $q = 2^8$ only 0.39% of the packets can be decoded before receiving all the packets. Therefore, for practical values of the field size, RANC (and also RLNC) hardly offers any opportunity of partial decoding.
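The inversion formula of Proposition 6 can be evaluated exactly for small parameters. The following sketch assumes the counts $F(a,l)$ and $G(l,k)$ given in the proof of Corollary 1 and $l \ge 1$; all function names are hypothetical.

```python
from fractions import Fraction
from math import comb

def gaussian_binomial(n, k, q):
    """Number of k-dimensional subspaces of GF(q)^n."""
    if k < 0 or k > n:
        return 0
    num = den = 1
    for i in range(k):
        num *= q ** (n - i) - 1
        den *= q ** (i + 1) - 1
    return num // den

def partial_decoding_pmf(k, l, F, G):
    """List of P_D(M, k; l, d) for d = 0..l, by the formula of Proposition 6."""
    return [
        Fraction(sum((-1) ** (a + d) * comb(a, d) * comb(k, a) * F(a, l)
                     for a in range(d, l + 1)), G(l, k))
        for d in range(l + 1)
    ]

def rlnc_counts(q, k):
    F = lambda a, l: gaussian_binomial(k - a, l - a, q)
    G = lambda l, kk: gaussian_binomial(kk, l, q)
    return F, G

def ranc_counts(q, k):
    G = lambda l, kk: q ** (kk - l) * gaussian_binomial(kk - 1, l - 1, q)
    F = lambda a, l: G(l, k) if a == 0 else gaussian_binomial(k - a, l - a, q)
    return F, G
```

For RANC with $q = 4$, $k = 3$, $l = 2$, the resulting distribution is $(8/20, 9/20, 3/20)$ over $d = 0, 1, 2$, whose mean is $kq^{l-k} = 3/4$, matching Corollary 1.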

Finally, let $P_T(\mathcal{M},k;r,d)$ be the probability to decode $d$ packets given that $r$ packets (not necessarily independent) have been received. Clearly, $P_T(\mathcal{M},k;r,d) = \sum_{l=0}^{k} P_I(\mathcal{M},k;r,l) P_D(\mathcal{M},k;l,d)$, and hence we can regroup the results above to determine the probability $P_T(\mathcal{M},k;r,d)$. For SAF, by Corollary 1 the expected number of decodable packets is given by $E_T(\mathcal{S},k;r) = E_I(\mathcal{S},k;r) \sim k(1 - e^{-r/k})$. For RLNC and RANC, we respectively have $E_T(\mathcal{C},k;r) \le E_D(\mathcal{C},k;r) \le kq^{r-k}$ and $E_T(\mathcal{A},k;r) \le kq^{r-k}$. In particular, $E_T(\mathcal{A},k;k-1) \le kq^{-1}$, hence only $q^{-1}$ of the packets can be partially decoded before receiving $k$ packets. RANC then follows a zero-one behavior: before receiving $k$ packets, no decoding is possible; once $k$ packets are received, they are independent with high probability and the whole message can be decoded. This behavior is illustrated in Figure 3, where the values of $E_T(\mathcal{S},k;r)$ and $E_T(\mathcal{A},k;r)$ for $q = 2^8$ are displayed for $k = 10$.

Fig. 3. Expected number of decodable elements as a function of the number of received elements for $k = 10$ transmitted elements

VI. Matroid error-correcting codes 
A. Operator channel and metrics for error correction 



In Section III we modeled data communication through a network as the transmission of a flat of a matroid. However, the model in Section III did not take into account the possible alterations undergone by the message during its transmission through the network. These alterations due to the network (packet losses, injections, errors, etc.) modify not only the packets but also the flat being transmitted. A flat $f \in \mathcal{F}_k$ can be modified in two ways: a deletion turns $f$ into a proper subflat of rank $k-1$, while an insertion turns $f$ into a proper superflat of rank $k+1$. A deletion (an insertion, respectively) is hence equivalent to moving one step down (up, respectively) the lattice of flats. Any flat $f$ can be turned into any other flat $g$ via a sequence of insertions and deletions. The terms "insertion" and "deletion" were first introduced in [4] for RLNC.

Proposition 7 below proves that the shortest way to modify one flat into another is to perform all the insertions first, and then all the deletions. This can be intuitively explained as follows. By performing the insertions first, the network works in larger flats with a much higher cardinality. This large number of combinations makes it possible to produce bases with elements 'distant' from the original flat, and hence to drift away from the original flat without taking steps on the lattice. It then suffices to go back down by deleting some elements. On the other hand, performing the deletions first implies working in smaller flats, which hinders the message from drifting away from the original flat.

Proposition 7: For any pair of flats $f$, $g$ of a matroid, the union-path $U(f,g)$, defined as starting from $f$, going up the lattice of flats to $\mathrm{cl}(f \cup g)$, and then going back down to $g$, is a shortest path between $f$ and $g$. Therefore, the shortest path distance $s(f,g)$ between $f$ and $g$ is given by $s(f,g) = 2\mathrm{rk}(f \cup g) - \mathrm{rk}(f) - \mathrm{rk}(g)$.

Proof: Without loss of generality, suppose $\mathrm{rk}(f) \ge \mathrm{rk}(g)$. We shall prove the claim by induction on $s(f,g)$. First, the cases $s(f,g) = 0$ and $s(f,g) = 1$ are trivial. Also, if $s(f,g) = 2$, then either $g \subset f$ and the union-path is the only path of length 2, or $g \not\subset f$ and the only path of length 2 distinct from the union-path is the intersection-path $\{f, f \cap g, g\}$. The intersection-path has length $\mathrm{rk}(f) + \mathrm{rk}(g) - 2\mathrm{rk}(f \cap g)$; however, $2\mathrm{rk}(f \cup g) - \mathrm{rk}(f) - \mathrm{rk}(g) \le \mathrm{rk}(f) + \mathrm{rk}(g) - 2\mathrm{rk}(f \cap g)$ by the submodular inequality, and hence the union-path is no longer than the intersection-path. Therefore, the union-path is among the shortest paths for $s(f,g) = 2$.

Now suppose the claim is true for all pairs of flats with a shortest path of length no more than $d$, and consider $f$ and $g$ such that $s(f,g) = d+1$. Let $\{f, p_1, \ldots, p_d, g\}$ be a shortest path between $f$ and $g$. Since $s(p_{d-1},g) = 2$, we assume $g \subset p_d$, without loss by the discussion above. However, $s(f,p_d) = d$ and hence $U(f,p_d)$ is a shortest path between $f$ and $p_d$. Therefore, $\{U(f,p_d)\} \cup \{g\}$ is a shortest path between $f$ and $g$ which first goes up the lattice and then down, and hence is equal to $U(f,g)$. ■

We model data transmission through a faulty network as an operator channel, where the source transmits a flat $f \in \mathcal{F}$ and the destination obtains another flat $g \in \mathcal{F}$, which is obtained after $\delta$ insertions and $\epsilon$ deletions, where $\delta = \mathrm{rk}(f \cup g) - \mathrm{rk}(f)$ and $\epsilon = \mathrm{rk}(f \cup g) - \mathrm{rk}(g)$, respectively. Accordingly, we define the lattice distance between $f$ and $g$ as

$$d_{\mathrm{L}}(f,g) = \delta + \epsilon = 2\mathrm{rk}(f \cup g) - \mathrm{rk}(f) - \mathrm{rk}(g) \qquad (7)$$

$$\le \mathrm{rk}(f \cup g) - \mathrm{rk}(f \cap g) \qquad (8)$$

$$\le \mathrm{rk}(f) + \mathrm{rk}(g) - 2\mathrm{rk}(f \cap g), \qquad (9)$$

where (8) and (9) follow from the submodular inequality.

For SAF, the lattice distance between two subsets is their Hamming distance; for RLNC, the lattice distance between two linear subspaces is their subspace distance. We remark that for both RLNC and SAF, we have equality in (8) and (9) for all flats. This however does not hold for all matroids; those which satisfy this property can all be expressed as direct sums of free matroids and projective geometries [3, Proposition 6.9.1].

Let us illustrate these remarks with the affine geometry. Two parallel hyperplanes $f$ and $g$ in $\mathcal{A}(q,n)$ are at lattice distance 2, while (8) and (9) yield $n+1$ and $2n$, respectively. Furthermore, by considering $f$, $g$, and the whole space, it can be easily shown that the right hand sides of (8) and (9) violate the triangular inequality. This example also illustrates how the lattice distance expresses the minimum number of operations required to change one flat into the other. In our example, changing $f$ into $g$ takes only two operations: first insert an element not belonging to $f$ to obtain the whole space, which has a basis given by $n$ elements of $g$ and one outside of $g$; then delete the latter element to obtain $g$.

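The modified lattice distance

$$d_{\mathrm{M}}(f,g) = \max\{\delta, \epsilon\} = \mathrm{rk}(f \cup g) - \min\{\mathrm{rk}(f), \mathrm{rk}(g)\} \qquad (10)$$

$$\le \max\{\mathrm{rk}(f), \mathrm{rk}(g)\} - \mathrm{rk}(f \cap g) \qquad (11)$$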

coincides with the modified Hamming metric for SAF (2j and with the injection distance for RLNC Q. 
Similarly to the lattice distance, equality holds in (11) for SAF and RLNC; however, in the example above the inequality is strict and the triangular inequality is violated by the right hand side.

We remark that both distances only depend on the lattice of flats of the matroid. However, for any 
non-simple matroid, there exists a simple matroid with the same lattice of flats. Therefore, our assumption in Section III of considering simple matroids only does not lead to any loss in generality.

Corollary [2] ensures that the distances defined above are metrics. Therefore, error correction for random 
network communications can be viewed as a coding theory problem, where the codewords are flats of a 
matroid associated to the network protocol and the distance between two flats is either the lattice distance 
or the modified lattice distance. 

Corollary 2: For any simple matroid with rank $r$, the lattice distance and the modified lattice distance associated to that matroid are metrics which take integer values between $0$ and $r$.

Proof: The lattice distance is a metric according to Proposition 7. For the modified lattice distance, we have $d_{\mathrm{M}}(f,g) = \frac{1}{2} d_{\mathrm{L}}(f,g) + \frac{1}{2}|\mathrm{rk}(f) - \mathrm{rk}(g)|$ for all flats $f, g \in \mathcal{F}$, and hence we easily obtain that $d_{\mathrm{M}}$ is also a metric. It is clear that these metrics only have integer values between $0$ and $r$. ■
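For RLNC, both distances reduce to rank computations, since the rank of $\mathrm{cl}(f \cup g)$ is the dimension of the sum of the two subspaces. The sketch below is restricted to $\mathrm{GF}(2)$ for simplicity, with vectors stored as integer bitmasks and flats given by spanning lists; these representation choices and the function names are assumptions made for illustration.

```python
def rank_gf2(rows):
    """Rank over GF(2) of vectors given as integer bitmasks."""
    basis = []
    for v in rows:
        for b in basis:
            v = min(v, v ^ b)   # clears the leading bit of b in v if it is set
        if v:
            basis.append(v)
    return len(basis)

def lattice_distance(f, g):
    """d_L(f, g) = 2 rk(f u g) - rk(f) - rk(g) for subspaces given by spanning sets."""
    return 2 * rank_gf2(f + g) - rank_gf2(f) - rank_gf2(g)

def modified_lattice_distance(f, g):
    """d_M(f, g) = rk(f u g) - min(rk(f), rk(g))."""
    return rank_gf2(f + g) - min(rank_gf2(f), rank_gf2(g))
```

As a usage example, with `f = [0b100, 0b010]` and `h = [0b001]` one obtains $d_{\mathrm{L}} = 3$ and $d_{\mathrm{M}} = 2$, consistent with the identity $d_{\mathrm{M}} = \frac{1}{2}d_{\mathrm{L}} + \frac{1}{2}|\mathrm{rk}(f) - \mathrm{rk}(h)|$ used in the proof of Corollary 2.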

B. Matroid codes 

For any simple matroid $\mathcal{M}$, we define a matroid code as a nonempty set of flats of $\mathcal{M}$ with the same rank, or equivalently as a subset of $\mathcal{F}_k$. The minimum lattice (modified lattice, respectively) distance of a matroid code is given by the minimum distance between any two distinct codewords. By (7) and (10), the minimum lattice distance of a matroid code is twice its minimum modified lattice distance. All the other classical parameters of a code, such as the error correction capability, the covering radius, the diameter, etc. can be similarly defined.

We now derive bounds on matroid codes which are natural generalizations of the bounds derived for constant-dimension codes and constant-weight codes. Indeed, the latter classes of codes are so well structured and can be easily bounded because they actually are matroid codes. In other words, by studying matroid codes, we investigate some core properties of these classes of codes. We denote the maximum cardinality of a matroid code on the flats of rank $k$ of $\mathcal{M}$ with minimum modified lattice distance $d$ (and hence minimum lattice distance $2d$) as $A(\mathcal{M},k,d)$. Denoting the rank of $\mathcal{M}$ as $r$, we first remark that $A(\mathcal{M},k,1) = N_k$ for all $0 \le k \le r$. Also, since $d_{\mathrm{M}}(f,g) \le \min\{k, r-k\}$ for all $f, g \in \mathcal{F}_k$, we shall use the following convention: $A(\mathcal{M},k,d) = 1$ for all $d > \min\{k, r-k\}$.

Johnson bounds were first derived for constant-weight codes [28] and have been adapted to constant-dimension codes in [10], [12]. Proposition 8 below generalizes these bounds to the case of matroid codes by restricting to a submatroid of inferior rank in two ways. First, for any $e \in E$, the contraction of $e$ from $\mathcal{M}$, denoted as $\mathcal{M}/e$, is the simple matroid with set of flats $\mathcal{F}(\mathcal{M}/e) = \{f \in \mathcal{F} : e \in f\}$ [3, Chapter 3]. Note that the matroid $\mathcal{M}/e$ has rank $r-1$ and for all $f, g \in \mathcal{F}(\mathcal{M}/e)$, $\mathrm{rk}_{\mathcal{M}/e}(f) = \mathrm{rk}_{\mathcal{M}}(f) - 1$ and hence $d_{\mathrm{M},\mathcal{M}/e}(f,g) = d_{\mathrm{M},\mathcal{M}}(f,g)$. Second, for any hyperplane $h \in \mathcal{F}_{r-1}$, the restriction of $\mathcal{M}$ to $h$, denoted as $\mathcal{M}|h$, is the simple matroid with set of flats $\{f \in \mathcal{F} : f \subseteq h\}$ [3, Section 1.3]. For any flats $f, g \subseteq h$, $\mathrm{rk}_{\mathcal{M}|h}(f) = \mathrm{rk}_{\mathcal{M}}(f)$ and hence $d_{\mathrm{M},\mathcal{M}|h}(f,g) = d_{\mathrm{M},\mathcal{M}}(f,g)$.

Proposition 8 (Johnson bound): For all $\mathcal{M}$ and $0 \le k \le r$, denote the minimum cardinality of a flat of rank $k$ and the minimum number of hyperplanes containing a given flat of rank $k$ as $c_k$ and $h_k$, respectively. Then there exist $e \in E$ and $h \in \mathcal{F}_{r-1}$ such that

$$A(\mathcal{M},k,d) \le \frac{|E|}{c_k} A(\mathcal{M}/e,k-1,d), \qquad (12)$$

$$A(\mathcal{M},k,d) \le \frac{N_{r-1}}{h_k} A(\mathcal{M}|h,k,d). \qquad (13)$$

Proof: We only prove (12), the proof of (13) being similar. For all $f \in \mathcal{F}$ and $e \in E$, let $\chi(e,f) = 1$ if $e \in f$ and $\chi(e,f) = 0$ otherwise. Let $C$ be a code on the flats of $\mathcal{M}$ with rank $k$, minimum distance $d$, and cardinality $A(\mathcal{M},k,d)$. Then for all $e \in E$, $C \cap \mathcal{F}(\mathcal{M}/e)$ can be viewed as a code on the flats of $\mathcal{M}/e$ with rank $k-1$, minimum distance at least $d$, and cardinality $\sum_{f \in C} \chi(e,f) \le A(\mathcal{M}/e,k-1,d)$. Conversely, we have $\sum_{e \in E} \chi(e,f) = |f| \ge c_k$ for all $f \in C$. Combining, we obtain that there exists an element $e' \in E$ for which $|E| A(\mathcal{M}/e',k-1,d) \ge \sum_{e \in E} \sum_{f \in C} \chi(e,f) \ge c_k A(\mathcal{M},k,d)$. ■
Proposition [9] below is a generalization of the Singleton bound for constant-dimension codes derived 
inQ. 

Proposition 9 (Singleton bound): For all $\mathcal{M}$, $0 \le k \le r$, and any element $e \in E$, we have $A(\mathcal{M},k,d) \le A(\mathcal{M}/e,k,d-1)$ and hence $A(\mathcal{M},k,d) \le \min_{g \in \mathcal{F}_{d-1}} |\{f \in \mathcal{F}_{k+d-1} : g \subseteq f\}|$.

Proof: Let $C$ be a code on $\mathcal{F}_k$ with minimum distance $d$ and cardinality $A(\mathcal{M},k,d)$. For any $f \in C$, we define the puncturing $H_e(f)$ as a flat in $\mathcal{F}_k(\mathcal{M}/e)$ containing $f$. Then by the lengths of the shortest paths on the lattice of flats of $\mathcal{M}$, we have $d_{\mathrm{L}}(H_e(f),H_e(g)) \ge d_{\mathrm{L}}(f,g) - 2$ and hence $d_{\mathrm{M}}(H_e(f),H_e(g)) \ge d_{\mathrm{M}}(f,g) - 1$ for all $f, g \in C$. Therefore, $\{H_e(f) : f \in C\}$ is a code on the flats of $\mathcal{M}/e$ with rank $k$, minimum distance at least $d-1$, and cardinality $A(\mathcal{M},k,d) \le A(\mathcal{M}/e,k,d-1)$. Applying this bound recursively yields the second upper bound. ■

We finish this section by noting that the concept of lifting, used to construct good matroid codes for 
RLNC |l|, p3| and SAF [2], could be generalized for any matroid. However, as shown in the case of 
SAF Q, these codes may not be optimal, hence we shall not develop this idea any further. 

C. Matroid codes for the affine geometry 

We are now interested in matroid codes on affine subspaces. By definition, we have $A(\mathcal{A}, k, 1) = N_k = q^{n-k+1} \begin{bmatrix} n \\ k-1 \end{bmatrix} \geq q^{k(n-k+1)}$ for all $0 \leq k \leq n+1$. Since the affine geometry $\mathcal{A}(q,n)$ is a submatroid of
the projective geometry $\mathcal{C}(q,n+1)$, the upper bound on codes on linear subspaces reviewed in Section
II-B yields

\[
A(\mathcal{A}(q,n),k,d) \leq A(\mathcal{C}(q,n+1),k,d) < K_q^{-1} q^{\min\{(n-k+1)(k-d+1),\, k(n-k-d+2)\}}. \tag{14}
\]

As we shall see later, the upper bound on $A(\mathcal{A}(q,n),k,d)$ in (14) is tight up to a scalar. However, we
refine this bound below by applying the Johnson bounds derived in Proposition 8 to codes on affine
subspaces.
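As a numerical sanity check, the following minimal sketch evaluates the count of rank-$k$ flats, $q^{n-k+1}$ times the Gaussian binomial of $n$ and $k-1$, and the last term of (14), $K_q^{-1} q^{\min\{(n-k+1)(k-d+1),\, k(n-k-d+2)\}}$, for small parameters. The function names are illustrative, and the sketch assumes the usual definition $K_q = \prod_{j \geq 1} (1 - q^{-j})$, truncated to finitely many factors; neither is taken from this section.

```python
def gaussian_binomial(n, k, q):
    """Number of k-dimensional linear subspaces of GF(q)^n."""
    if k < 0 or k > n:
        return 0
    num, den = 1, 1
    for i in range(k):
        num *= q ** (n - i) - 1
        den *= q ** (k - i) - 1
    return num // den


def num_affine_flats(n, k, q):
    """Number of rank-k flats of A(q, n), i.e. affine subspaces of dimension k - 1."""
    return q ** (n - k + 1) * gaussian_binomial(n, k - 1, q)


def k_q(q, terms=200):
    """Truncated product prod_{j>=1} (1 - q^-j); assumed definition of K_q."""
    value = 1.0
    for j in range(1, terms + 1):
        value *= 1 - q ** (-j)
    return value


def exponent(n, k, d):
    """Exponent min{(n-k+1)(k-d+1), k(n-k-d+2)} appearing in (14)."""
    return min((n - k + 1) * (k - d + 1), k * (n - k - d + 2))


def upper_bound_14(n, k, d, q):
    """Last term of (14): K_q^{-1} * q^exponent."""
    return q ** exponent(n, k, d) / k_q(q)


if __name__ == "__main__":
    q, n = 2, 4
    for k in range(1, n + 2):
        flats = num_affine_flats(n, k, q)
        print(k, q ** (k * (n - k + 1)), flats, round(upper_bound_14(n, k, 1, q), 1))
```

For $d = 1$ the code is the set of all rank-$k$ flats, so each printed row should show the lower estimate, the exact count, and the upper bound in increasing order.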

Proposition 10 (Bounds on codes on affine subspaces): For all $2 \leq k \leq n-1$ and $2 \leq d \leq \min\{k, n-k+1\}$, we have

\begin{align}
A(\mathcal{A}(q,n),k,d) &\leq q^{n-k+1} A(\mathcal{C}(q,n),k-1,d), \tag{15} \\
A(\mathcal{A}(q,n),k,d) &\leq q \frac{q^n - 1}{q^{n-k+1} - 1} A(\mathcal{A}(q,n-1),k,d) \tag{16} \\
&\leq q \frac{q^n - 1}{q^{n-k+1} - 1} \cdot q \frac{q^{n-1} - 1}{q^{n-k} - 1} \cdots q \frac{q^{k+d-1} - 1}{q^{d} - 1}. \notag
\end{align}

Proof: For any $e \in \mathrm{GF}(q)^n$ and $h \in \mathcal{F}_n(\mathcal{A}(q,n))$, $\mathcal{A}(q,n)/e$ and $\mathcal{A}(q,n)|h$ are isomorphic to
$\mathcal{C}(q,n)$ and $\mathcal{A}(q,n-1)$, respectively [13, Proposition 6.2.5]. Also, every flat with rank $k$ of $\mathcal{A}(q,n)$
contains $q^{k-1}$ elements and is contained in $\begin{bmatrix} n-k+1 \\ 1 \end{bmatrix}$ hyperplanes. Applying Proposition 9 and (12) and (13)
in Proposition 8 to $\mathcal{A}(q,n)$ hence leads to (15) and (16), respectively. Finally, applying (16) recursively
yields the last upper bound. ■
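The two counting facts used in this proof (a rank-$k$ flat has $q^{k-1}$ elements and lies in $(q^{n-k+1}-1)/(q-1)$ hyperplanes) can be checked by brute force in a small case. The sketch below is only an illustration under stated assumptions: it works over GF(2) with $n = 3$, where a rank-2 flat (an affine line) is simply a pair of distinct points and every affine hyperplane is the solution set of $a \cdot x = b$ with $a \neq 0$. All names are hypothetical.

```python
from itertools import combinations, product

n = 3
points = list(product((0, 1), repeat=n))  # the 2^n elements of A(2, n)


def dot(a, x):
    """Inner product over GF(2)."""
    return sum(ai * xi for ai, xi in zip(a, x)) % 2


# Affine hyperplanes: {x : a.x = b} for every nonzero a and every b in GF(2).
hyperplanes = [
    frozenset(x for x in points if dot(a, x) == b)
    for a in points if any(a)
    for b in (0, 1)
]

# Rank-2 flats over GF(2): affine lines, i.e. unordered pairs of distinct points.
lines = [frozenset(pair) for pair in combinations(points, 2)]

# Number of hyperplanes containing each rank-1 flat (point) and each rank-2 flat (line).
hyperplanes_per_point = {sum(x in h for h in hyperplanes) for x in points}
hyperplanes_per_line = {sum(line <= h for h in hyperplanes) for line in lines}

if __name__ == "__main__":
    print(len(hyperplanes), hyperplanes_per_point, hyperplanes_per_line)
```

With $q = 2$ and $n = 3$, the formula gives $2^3 - 1 = 7$ hyperplanes through each point and $2^2 - 1 = 3$ through each line, which is what the two sets should contain.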



We remark that the first Johnson bound in (15) is good for $2k < n+1$, while the second Johnson
bound in (16) is good for $2k > n+1$. Also, the bounds obtained by applying the Singleton bound in
Proposition 9 are looser than (15), and are hence omitted.

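Recall that for any $\mathbf{M} \in \mathrm{GF}(q)^{k \times (n-k+1)}$, the affine lifting of $\mathbf{M}$, hereafter denoted as $I_{\mathcal{A}}(\mathbf{M}) \in \mathcal{F}_k$, is
the closure of the rows of $(\mathbf{I}'_k | \mathbf{M})$, where $\mathbf{I}'_k = (\mathbf{0} | \mathbf{I}_{k-1})^T \in \mathrm{GF}(q)^{k \times (k-1)}$. Note that $\mathrm{rk}(I_{\mathcal{A}}(\mathbf{M})) =
\mathrm{rank}(\mathbf{1} | \mathbf{I}'_k | \mathbf{M}) = k$ for all $\mathbf{M} \in \mathrm{GF}(q)^{k \times (n-k+1)}$, hence $I_{\mathcal{A}}$ indeed maps $\mathrm{GF}(q)^{k \times (n-k+1)}$ to $\mathcal{F}_k$.
Proposition 11 below shows that the affine lifting preserves the distance.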

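Proposition 11 (Affine lifting): For any $\mathbf{M}, \mathbf{N} \in \mathrm{GF}(q)^{k \times (n-k+1)}$, we have $d_M(I_{\mathcal{A}}(\mathbf{M}), I_{\mathcal{A}}(\mathbf{N})) =
d_R(\mathbf{M}, \mathbf{N})$.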

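Proof: We have

\begin{align*}
\mathrm{rk}(I_{\mathcal{A}}(\mathbf{M}) \cup I_{\mathcal{A}}(\mathbf{N}))
&= \mathrm{rank} \begin{pmatrix} \mathbf{1} & \mathbf{I}'_k & \mathbf{M} \\ \mathbf{1} & \mathbf{I}'_k & \mathbf{N} \end{pmatrix} \\
&= \mathrm{rank} \begin{pmatrix} \mathbf{1} & \mathbf{I}'_k & \mathbf{M} \\ \mathbf{0} & \mathbf{0} & \mathbf{M} - \mathbf{N} \end{pmatrix} \\
&= k + \mathrm{rank}(\mathbf{M} - \mathbf{N}),
\end{align*}

since the matrix $(\mathbf{1} | \mathbf{I}'_k)$ has rank $k$, and hence $d_M(I_{\mathcal{A}}(\mathbf{M}), I_{\mathcal{A}}(\mathbf{N})) = d_R(\mathbf{M}, \mathbf{N})$. ■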
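The identity in this proof can be checked numerically. The sketch below is a minimal illustration over a prime field GF(p), with hypothetical helper names. It assumes that the rank of an affine closure is the rank of the matrix obtained by prepending an all-ones column, and that the distance between two flats $f, g$ of rank $k$ is $\mathrm{rk}(f \cup g) - k$, which is how the proof uses it. It requires Python 3.8 or later for the modular inverse `pow(x, -1, p)`.

```python
import random


def rank_mod_p(rows, p):
    """Rank of a matrix (list of rows) over the prime field GF(p)."""
    m = [[x % p for x in r] for r in rows]
    rank = 0
    ncols = len(m[0]) if m else 0
    for c in range(ncols):
        pivot = next((r for r in range(rank, len(m)) if m[r][c]), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        inv = pow(m[rank][c], -1, p)  # modular inverse of the pivot
        m[rank] = [(x * inv) % p for x in m[rank]]
        for r in range(len(m)):
            if r != rank and m[r][c]:
                factor = m[r][c]
                m[r] = [(a - factor * b) % p for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


def affine_lifting_rows(M):
    """Rows of (I'_k | M), where I'_k = (0 | I_{k-1})^T has size k x (k-1)."""
    k = len(M)
    return [[1 if j == i - 1 else 0 for j in range(k - 1)] + list(M[i])
            for i in range(k)]


def affine_rank(pts, p):
    """Rank of the affine closure of the given points: rank of (1 | points)."""
    return rank_mod_p([[1] + list(x) for x in pts], p)


def matroid_distance(M, N, p):
    """rk(I_A(M) u I_A(N)) - k for two k-row matrices M and N."""
    union = affine_lifting_rows(M) + affine_lifting_rows(N)
    return affine_rank(union, p) - len(M)


def rank_distance(M, N, p):
    """Rank distance d_R(M, N) = rank(M - N) over GF(p)."""
    diff = [[(a - b) % p for a, b in zip(rm, rn)] for rm, rn in zip(M, N)]
    return rank_mod_p(diff, p)


def check_distance_preserved(p, k, cols, trials, seed):
    """Compare both distances on random pairs of k x cols matrices."""
    rng = random.Random(seed)
    for _ in range(trials):
        M = [[rng.randrange(p) for _ in range(cols)] for _ in range(k)]
        N = [[rng.randrange(p) for _ in range(cols)] for _ in range(k)]
        if matroid_distance(M, N, p) != rank_distance(M, N, p):
            return False
    return True


if __name__ == "__main__":
    print(check_distance_preserved(p=3, k=3, cols=4, trials=100, seed=0))
```

Here `cols` plays the role of $n - k + 1$, so the example call corresponds to $k = 3$ and $n = 6$ over GF(3).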
We now design a class of nearly optimal codes for the affine geometry based on affine liftings of
Gabidulin codes. Let $C$ be a Gabidulin code on $\mathrm{GF}(q)^{k \times (n-k+1)}$ with minimum rank distance $d$. Then
by Proposition 11 its affine lifting $I_{\mathcal{A}}(C) = \{I_{\mathcal{A}}(\mathbf{M}) : \mathbf{M} \in C\}$ is a matroid code of $\mathcal{A}(q,n)$ with rank
$k$, minimum distance $d$, and cardinality $q^{\min\{(n-k+1)(k-d+1),\, k(n-k-d+2)\}}$.

Corollary 3: For all $0 \leq k \leq n+1$, we have $A(\mathcal{A}, k, d) \geq q^{\min\{(n-k+1)(k-d+1),\, k(n-k-d+2)\}}$.

Combining this result with (14) and the bounds on the maximum cardinality of constant-dimension
codes reviewed in Section II, we obtain

\[
K_q < \frac{A(\mathcal{A}(q,n),k,d)}{A(\mathcal{C}(q,n+1),k,d)} \leq 1.
\]

Therefore, RANC utilizes






codes with a similar cardinality to subspace codes for packets longer by one symbol. This gain is clearly 
illustrated by the definition of affine lifting, which removes the first column from the identity matrix used 
in the linear lifting. The gain in data rate in (2) derived for the error-free case is hence preserved when
error control is implemented. Furthermore, we prove below that this gain comes with no significant cost 
in terms of decoding complexity. 

By construction, the lifting introduced above for RANC is closely related to the lifting introduced in
[1] for RLNC. We now utilize this relation to design a low-complexity decoding algorithm for affine liftings of
Gabidulin codes. More generally, we prove that decoding the affine lifting of a rank metric code can be
performed using a subspace distance decoder for the linear lifting of the same code.

To clarify notation, we shall use the subscripts $\mathcal{A}$ and $\mathcal{C}$ to refer to objects (ranks and lattice
distances) defined for RANC and RLNC, respectively. We introduce the nonsingular matrix $\mathbf{X}_{n+1} =
(\mathbf{v}_k^T | \mathbf{I}'_{n+1}) \in \mathrm{GF}(q)^{(n+1) \times (n+1)}$, where $\mathbf{v}_k = (1, -1, -1, \ldots, -1, 0, \ldots, 0)$ has $k$ nonzero coefficients.
For any affine subspace $M \in \mathcal{F}(\mathcal{A}(q,n))$ of rank $k$ given by the closure of the rows of the matrix
$\mathbf{M} \in \mathrm{GF}(q)^{k \times n}$, we denote the linear subspace of $\mathrm{GF}(q)^{n+1}$ with dimension $k$ generated by $(\mathbf{1} | \mathbf{M}) \mathbf{X}_{n+1}$
as $\Gamma(M) \in \mathcal{F}(\mathcal{C}(q,n+1))$. Multiplying on the right by $\mathbf{X}_{n+1}$ can hence be viewed as mapping the affine
subspaces of $\mathrm{GF}(q)^n$ into linear subspaces of $\mathrm{GF}(q)^{n+1}$. Proposition 12 below shows that this mapping
preserves the lattice distance, and that the image of the affine lifting of a matrix is the linear lifting of
the same matrix.

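Proposition 12: For any affine subspace $M \in \mathcal{F}(\mathcal{A}(q,n))$ and any affine lifting $I_{\mathcal{A}}(\mathbf{C}) \in \mathcal{F}(\mathcal{A}(q,n))$,
we have $d_{L,\mathcal{A}}(M, I_{\mathcal{A}}(\mathbf{C})) = d_{L,\mathcal{C}}(\Gamma(M), I_{\mathcal{C}}(\mathbf{C}))$.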

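Proof: Since $\mathbf{X}_{n+1}$ is nonsingular, we have $\mathrm{rk}_{\mathcal{A}}(M) = \mathrm{rank}\{(\mathbf{1} | \mathbf{M}) \mathbf{X}_{n+1}\} = \mathrm{rk}_{\mathcal{C}}(\Gamma(M))$. Also, it
is easily shown that $(\mathbf{1} | \mathbf{I}'_k | \mathbf{C}) \mathbf{X}_{n+1} = (\mathbf{I}_k | \mathbf{C})$, and hence

\begin{align}
d_{L,\mathcal{A}}(M, I_{\mathcal{A}}(\mathbf{C}))
&= 2\,\mathrm{rank} \begin{pmatrix} \mathbf{1} \,|\, \mathbf{M} \\ \mathbf{1} \,|\, \mathbf{I}'_k \,|\, \mathbf{C} \end{pmatrix} - \mathrm{rank}(\mathbf{1} | \mathbf{M}) - \mathrm{rank}(\mathbf{1} | \mathbf{I}'_k | \mathbf{C}) \tag{17} \\
&= 2\,\mathrm{rank} \begin{pmatrix} (\mathbf{1} | \mathbf{M}) \mathbf{X}_{n+1} \\ (\mathbf{I}_k | \mathbf{C}) \end{pmatrix} - \mathrm{rank}\{(\mathbf{1} | \mathbf{M}) \mathbf{X}_{n+1}\} - \mathrm{rank}(\mathbf{I}_k | \mathbf{C}) \tag{18} \\
&= d_{L,\mathcal{C}}(\Gamma(M), I_{\mathcal{C}}(\mathbf{C})), \tag{19}
\end{align}

where (17) and (19) follow from the definition of the lattice distance given earlier, while (18) is obtained by multiplying
by $\mathbf{X}_{n+1}$ on the right. ■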
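The identity stated as "easily shown" in this proof, namely that multiplying the matrix (1 | I'_k | C) on the right by X_{n+1} gives (I_k | C), can be verified on a small instance. The following sketch is an illustration over a prime field GF(p); the helper names and the example matrix are hypothetical. It builds X_{n+1} with first column v_k^T and remaining columns I'_{n+1}, as described in the text.

```python
def x_matrix(n, k):
    """X_{n+1} = (v_k^T | I'_{n+1}) as an (n+1) x (n+1) integer matrix."""
    size = n + 1
    X = [[0] * size for _ in range(size)]
    for i in range(size):
        # First column: v_k^T = (1, -1, ..., -1, 0, ..., 0)^T with k nonzero entries.
        if i == 0:
            X[i][0] = 1
        elif i < k:
            X[i][0] = -1
        # Remaining n columns: I'_{n+1} = (0 | I_n)^T, i.e. row i has a 1 in column i.
        if i >= 1:
            X[i][i] = 1
    return X


def mat_mul_mod(A, B, p):
    """Matrix product A * B with entries reduced modulo p."""
    return [[sum(a * b for a, b in zip(row, col)) % p for col in zip(*B)]
            for row in A]


def lifted_affine(C):
    """The k x (n+1) matrix (1 | I'_k | C)."""
    k = len(C)
    return [[1] + [1 if j == i - 1 else 0 for j in range(k - 1)] + list(C[i])
            for i in range(k)]


def lifted_linear(C):
    """The k x (n+1) matrix (I_k | C) used by the linear lifting."""
    k = len(C)
    return [[1 if j == i else 0 for j in range(k)] + list(C[i]) for i in range(k)]


if __name__ == "__main__":
    p, k, n = 3, 3, 5
    C = [[1, 2, 0], [0, 1, 1], [2, 2, 1]]  # k x (n - k + 1), entries in GF(3)
    product = mat_mul_mod(lifted_affine(C), x_matrix(n, k), p)
    print(product == lifted_linear(C))
```

Since X_{n+1} is the identity matrix with k - 1 extra entries equal to -1 in its first column, it is lower triangular with unit diagonal, which is one way to see that it is nonsingular.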



By Proposition 12, decoding $M$ using the affine lifting of a Gabidulin code is equivalent to decoding
$\Gamma(M)$ using the linear lifting of the same code. We remark that transforming the matrix $\mathbf{M}$ into
$(\mathbf{1} | \mathbf{M}) \mathbf{X}_{n+1}$ can be simply performed by adding a column in front of $\mathbf{M}$, whose value is given by
$\mathbf{1} - \sum_{i=0}^{k-2} \mathrm{col}_i$, where $\mathrm{col}_i$ denotes the $i$-th column of $\mathbf{M}$ for all $0 \leq i \leq n-1$. Therefore, the decoding
algorithm for the affine lifting of a Gabidulin code follows two steps: first, obtain a matrix $\mathbf{M}$ and add
the column $\mathbf{1} - \sum_{i=0}^{k-2} \mathrm{col}_i$ in front of it; second, apply the bounded subspace distance decoding algorithm
in [1] to the row space of the extended matrix. It is clear that the complexity of this algorithm is on the
same order as that of the algorithm in [1] for the same Gabidulin code, which is $O(n^2)$ operations over
$\mathrm{GF}(q^{n-k+1})$. To summarize our results, the proposed implementation scheme for affine network
coding is illustrated in Figure 4 for $2k < n+1$.

Fig. 4. Implementation scheme for affine network coding.
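The first decoding step can be sketched as follows. This is only an illustration over a prime field GF(p) with hypothetical function names; the summation range (the first k - 1 columns of M) is inferred from the fact that v_k has k nonzero coefficients. The second step, the bounded subspace distance decoder of [1], is not reproduced here. The sketch compares the column-prepending shortcut with the explicit product (1 | M) X_{n+1}.

```python
def prepend_column(M, k, p):
    """Shortcut for (1 | M) X_{n+1}: prepend the column 1 - sum of the first k-1 columns."""
    return [[(1 - sum(row[:k - 1])) % p] + [x % p for x in row] for row in M]


def x_matrix(n, k):
    """X_{n+1}: identity matrix with -1 in rows 1..k-1 of the first column."""
    X = [[1 if i == j else 0 for j in range(n + 1)] for i in range(n + 1)]
    for i in range(1, k):
        X[i][0] = -1
    return X


def explicit_product(M, k, p):
    """Reference computation of (1 | M) X_{n+1} modulo p."""
    n = len(M[0])
    X = x_matrix(n, k)
    extended = [[1] + list(row) for row in M]
    return [[sum(a * b for a, b in zip(row, col)) % p for col in zip(*X)]
            for row in extended]


if __name__ == "__main__":
    p, k = 3, 3
    # Rows of (I'_k | C) for a 3 x 3 matrix C over GF(3), so n = 5.
    M = [[0, 0, 1, 2, 0], [1, 0, 0, 1, 1], [0, 1, 2, 2, 1]]
    print(prepend_column(M, k, p) == explicit_product(M, k, p))
    print(prepend_column(M, k, p))
```

The shortcut costs one pass over the first k - 1 entries of each row, so this preprocessing does not change the order of the overall decoding complexity.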

VII. Conclusion 

In this paper, we introduced a novel model for studying the performance of, and designing noncoherent
error control for, data transmission through a network. This model, based on flats of matroids, encompasses traditional
techniques, such as linear network coding and routing, and offers a wealth of alternatives to these
protocols. We evaluated the performance of these two protocols both in the error-free case and in the
case where packets are lost, injected, or in error. We then designed a new network coding protocol based
on the affine geometry which outperforms linear network coding in terms of data rate for the coded and
non-coded cases. We identified a class of nearly optimal codes, for which we provided a low-complexity
decoding algorithm. The results are summarized in Table I.

TABLE I
Summary of parameters for SAF, RLNC, and RANC.

This topic opens many directions for future research, some of which are detailed below. First, the model
we proposed is based on simple assumptions, which may not accurately reflect the reality of the network.
Hence, we need to investigate how the specific features of a given network can be incorporated into our model.
Second, many different types of matroids have been proposed, from the most elementary to the most
sophisticated. Determining which matroids are desirable for a given situation is an important research
direction, as it also determines the corresponding protocol. Third, once the matroid is fixed, some tools
are required to evaluate its performance and to compare it with other matroids. Although we introduced
some parameters, such as the data rate and the average delay, new parameters may reflect some situations
more accurately. Fourth, random affine network coding deserves to be investigated in further detail. In
particular, the implementation of the low-complexity decoding procedure for liftings of Gabidulin codes
introduced in this paper has a significant impact on the feasibility of affine network coding. Fifth, from a
more practical perspective, combining matroids may retain the advantages of the original matroids. For instance,
combining SAF and RLNC may lead to transmitting packets with different priorities; in terms of error
control, combining matroids may also lead to unequal protection against errors.